.

.

Wallaroo MLOps API Essentials Guide

Basic Guide for the Wallaroo API

This tutorial and the assets are available as part of the Wallaroo Tutorials repository.

Wallaroo MLOps API Tutorial

The Wallaroo MLOps API allows organizations to submit requests to their Wallaroo instance to perform such actions as:

  • Create a new user and invite them to the instance.
  • Create workspaces and list their configuration details.
  • Upload a model.
  • Deploy and undeploy a pipeline.

The following examples will show how to submit queries to the Wallaroo MLOps API and the types of responses returned.

References

The following references are available for more information about Wallaroo and the Wallaroo MLOps API:

  • Wallaroo Documentation Site: The Wallaroo Documentation Site
  • Wallaroo MLOps API Documentation from a Wallaroo instance: A Swagger UI based documentation is available from your Wallaroo instance at https://{Wallaroo Prefix.}api.{Wallaroo Suffix}/v1/api/docs. For example, if the Wallaroo Instance suffix is example.wallaroo.ai with the prefix {lovely-rhino-5555.}, then the Wallaroo MLOps API Documentation would be available at https://lovely-rhino-5555.api.example.wallaroo.ai/v1/api/docs. Note the . is part of the prefix.
  • For another example, a Wallaroo Enterprise users who do not use a prefix and has the suffix wallaroo.example.wallaroo.ai, the the Wallaroo MLOps API Documentation would be available at https://api.wallaroo.example.wallaroo.ai/v1/api/docs. For more information, see the Wallaroo Documentation Site.

IMPORTANT NOTE: The Wallaroo MLOps API is provided as an early access features. Future iterations may adjust the methods and returns to provide a better user experience. Please refer to this guide for updates.

Prerequisites

  • An installed Wallaroo instance.
  • The following Python libraries installed:
    • requests
    • json
    • wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
    • pandas: Pandas, mainly used for Pandas DataFrame. Included with the Wallaroo JupyterHub service by default.
    • pyarrow: PyArrow for Apache Arrow support. Included with the Wallaroo JupyterHub service by default.
    • polars: Polars for DataFrame with native Apache Arrow support

OpenAPI Steps

The following demonstrates how to use each command in the Wallaroo MLOps API, and can be modified as best fits your organization’s needs.

Import Libraries

For the examples, the Python requests library will be used to make the REST HTTP(S) connections.

import wallaroo
from wallaroo.object import EntityNotFoundError
import pandas as pd
import os

import pyarrow as pa

import requests
from requests.auth import HTTPBasicAuth

import json

# used to display dataframe information without truncating
from IPython.display import display
pd.set_option('display.max_colwidth', None)

Notes About This Guide

The following guide was established with set names for workspaces, pipelines, and models. Note that some commands, such as creating a workspace, will fail if another workspace is already created with the same name. Similar, if a user is already established with the same email address as in the examples below, etc.

To reduce errors, the following variables are declared. Please change them as required to avoid issues in an established Wallaroo environment.

For wallarooPrefix = "YOUR PREFIX." and wallarooSuffix = "YOUR SUFFIX", enter the prefix and suffix for your Wallaroo instance DNS name. If the prefix instance is blank, then it can be wallarooPrefix = "". Note that the prefix includes the . for proper formatting.

## Sample Variables List

new_user = "john.hansarick@wallaroo.ai"
new_user_password = "Snugglebunnies"

example_workspace_name = "apiworkspaces"
model_name = "apimodel"
model_file_name = "./models/ccfraud.onnx"

stream_model_name = "apiteststreammodel"
stream_model_file_name = "./models/ccfraud.onnx"

empty_pipeline_name="pipelinenomodel"

model_pipeline_name="pipelinemodels"

example_copied_pipeline_name="copiedmodelpipeline"

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

# Retrieving login data through credential file
f = open('./creds.json')
login_data = json.load(f)

Retrieve Credentials

Through Keycloak

Wallaroo comes pre-installed with a confidential OpenID Connect client. The default client is api-client, but other clients may be created and configured.

Confidential clients require its secret to be supplied when requesting a token. Administrators may obtain their API client credentials from Keycloak from the Keycloak Service URL as listed above and the prefix /auth/admin/master/console/#/realms/master/clients.

For example, if the Wallaroo DNS address is in the format https://{WALLAROO PREFIX.}{WALLAROO SUFFIX}, then the direct path to the Keycloak API client credentials would be:

https://{WALLAROO PREFIX.}keycloak.{WALLAROO SUFFIX}/auth/admin/master/console/#/realms/master/clients

If the there is no prefix, then the address would simply be:

https://keycloak.{WALLAROO SUFFIX}/auth/admin/master/console/#/realms/master/clients

Then select the client, in this case api-client, then Credentials.

By default, tokens issued for api-client are valid for up to 60 minutes. Refresh tokens are supported.

Token Types

There are two tokens used with Wallaroo API services:

  • MLOps tokens: User tokens are generated with the confidential client credentials and the username/password of the Wallaroo user making the MLOps API request and requires:

    • The Wallaroo instance Keycloak address.

    • The confidential client, api-client by default.

    • The confidential client secret.

    • The Wallaroo username making the MLOps API request.

    • The Wallaroo user’s password.

      This request return includes the access_token and the refresh_token. The access_token is used to authenticate. The refresh_token can be used to create a new token without submitting the original username and password.

      A sample curl version of that request is:

      eval $(curl "https://${URL_PREFIX}keycloak.${URL_SUFFIX}/auth/realms/master/protocol/openid-connect/token" -u "${CONFIDENTIAL_CLIENT}:${CONFIDENTIAL_CLIENT_SECRET}" -d "grant_type=password&username=${USERNAME}&password=${PASSWORD}&scope=offline_access' -s | jq -r '"TOKEN=\(.access_token) REFRESH=\(.refresh_token)"')
      
      • Tokens can be refreshed via a refresh request and require:
        • The confidential client, api-client by default.

        • The confidential client secret.

        • The refresh token retrieved from the initial access token request. A curl version of that request is:

          TOKEN=$(curl "https://${URL_PREFIX}keycloak.${URL_SUFFIX}/auth/realms/master/protocol/openid-connect/token" -u "${CONFIDENTIAL_CLIENT}:${CONFIDENTIAL_CLIENT_SECRET}" -d "grant_type=refresh_token&refresh_token=${REFRESH}" -s | jq -r '.access_token')
          
  • Inference Token: Tokens used as part of a Pipeline Inference URL request. These do not require a Wallaroo user credentials. Inference token request require the following:

    • The Wallaroo instance Keycloak address.

    • The confidential client, api-client by default.

    • The confidential client secret.

      A curl version of that command is:

      TOKEN=$(curl "https://${URL_PREFIX}keycloak.${URL_SUFFIX}/auth/realms/master/protocol/openid-connect/token" -u "${CONFIDENTIAL_CLIENT}:${CONFIDENTIAL_CLIENT_SECRET}" -d 'grant_type=client_credentials' -s | jq -r '.access_token')
      

The following examples demonstrate:

  • Generating a MLOps API token with the confidential client, client secret, username, and password.
  • Refreshing a MLOps API token with the confidential client and client secret (the username and password are not required for refreshing the token).
  • Generate a Pipeline Inference URl token with the confidential client and client secret (username and password are not required).

The username and password for the user are stored in the file ./creds.json to prevent them from being displayed in a demonstration.

## Generating token with confidential client, client secret, username, password

TOKENURL=f'https://{wallarooPrefix}keycloak.{wallarooSuffix}/auth/realms/master/protocol/openid-connect/token'

USERNAME = login_data["username"]
PASSWORD = login_data["password"]
CONFIDENTIAL_CLIENT=login_data["confidentialClient"]
CONFIDENTIAL_CLIENT_SECRET=login_data["confidentialPassword"]

auth = HTTPBasicAuth(CONFIDENTIAL_CLIENT, CONFIDENTIAL_CLIENT_SECRET)
data = {
    'grant_type': 'password',
    'username': USERNAME,
    'password': PASSWORD
}
response = requests.post(TOKENURL, auth=auth, data=data, verify=True)
access_token = response.json()['access_token']
refresh_token = response.json()['refresh_token']
display(access_token)
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNjAxNjUsImlhdCI6MTY4NDM1NjU2NSwianRpIjoiZGQxMDFkODMtMzk5ZC00N2M2LThlZDMtNjQxMGRmNThhYmViIiwiaXNzIjoiaHR0cHM6Ly9kb2MtdGVzdC5rZXljbG9hay53YWxsYXJvb2NvbW11bml0eS5uaW5qYS9hdXRoL3JlYWxtcy9tYXN0ZXIiLCJhdWQiOlsibWFzdGVyLXJlYWxtIiwiYWNjb3VudCJdLCJzdWIiOiIwMjhjOGI0OC1jMzliLTQ1NzgtOTExMC0wYjViZGQzODI0ZGEiLCJ0eXAiOiJCZWFyZXIiLCJhenAiOiJhcGktY2xpZW50Iiwic2Vzc2lvbl9zdGF0ZSI6Ijk4MDhkZTA5LWU2NjYtNGIyNC05ZWQ4LTc2MmUxZjllODk0ZSIsImFjciI6IjEiLCJhbGxvd2VkLW9yaWdpbnMiOlsiKiJdLCJyZWFsbV9hY2Nlc3MiOnsicm9sZXMiOlsiZGVmYXVsdC1yb2xlcy1tYXN0ZXIiLCJvZmZsaW5lX2FjY2VzcyIsInVtYV9hdXRob3JpemF0aW9uIl19LCJyZXNvdXJjZV9hY2Nlc3MiOnsibWFzdGVyLXJlYWxtIjp7InJvbGVzIjpbIm1hbmFnZS11c2VycyIsInZpZXctdXNlcnMiLCJxdWVyeS1ncm91cHMiLCJxdWVyeS11c2VycyJdfSwiYWNjb3VudCI6eyJyb2xlcyI6WyJtYW5hZ2UtYWNjb3VudCIsIm1hbmFnZS1hY2NvdW50LWxpbmtzIiwidmlldy1wcm9maWxlIl19fSwic2NvcGUiOiJwcm9maWxlIGVtYWlsIiwic2lkIjoiOTgwOGRlMDktZTY2Ni00YjI0LTllZDgtNzYyZTFmOWU4OTRlIiwiZW1haWxfdmVyaWZpZWQiOmZhbHNlLCJodHRwczovL2hhc3VyYS5pby9qd3QvY2xhaW1zIjp7IngtaGFzdXJhLXVzZXItaWQiOiIwMjhjOGI0OC1jMzliLTQ1NzgtOTExMC0wYjViZGQzODI0ZGEiLCJ4LWhhc3VyYS1kZWZhdWx0LXJvbGUiOiJ1c2VyIiwieC1oYXN1cmEtYWxsb3dlZC1yb2xlcyI6WyJ1c2VyIl0sIngtaGFzdXJhLXVzZXItZ3JvdXBzIjoie30ifSwibmFtZSI6IkpvaG4gSGFuc2FyaWNrIiwicHJlZmVycmVkX3VzZXJuYW1lIjoiam9obi5odW1tZWxAd2FsbGFyb28uYWkiLCJnaXZlbl9uYW1lIjoiSm9obiIsImZhbWlseV9uYW1lIjoiSGFuc2FyaWNrIiwiZW1haWwiOiJqb2huLmh1bW1lbEB3YWxsYXJvby5haSJ9.lfesLVvWM21kbWIhQtT2ap-ruT5_qt7CVcPUt1mAS8KoksuiJIb4QxPV9FwGB1I7sPiWjXR60cR-cjNLLoTgCX9GZZbfISDR4NvqN5ZBANDzYCx64WtTZCaDPeWROClHvmmE6Mfs1mAdgC3fIxkDe6Ns5-S6wnDqW7v6-yaNo5gBywftaCFyD3lFsmpmBvcyXphtn7sUlX_W4Ku9xmaalUkLv1F8528thZAARN5Jl-_uTHNKCe5wYGiEpQkbeIZ_Rjzqnctx-onw3cVKgbS6_wATr0TZQxgR2AY459OkCJ3rcuJTTTI5PihEGKlQUX5GmDIGG8DqE3iAPJ-xCY-OBQ'
## Refresh the token

TOKENURL=f'https://{wallarooPrefix}keycloak.{wallarooSuffix}/auth/realms/master/protocol/openid-connect/token'

# Retrieving through os environmental variables 
f = open('./creds.json')
login_data = json.load(f)

CONFIDENTIAL_CLIENT=login_data["confidentialClient"]
CONFIDENTIAL_CLIENT_SECRET=login_data["confidentialPassword"]

auth = HTTPBasicAuth(CONFIDENTIAL_CLIENT, CONFIDENTIAL_CLIENT_SECRET)
data = {
    'grant_type': 'refresh_token',
    'refresh_token': refresh_token
}
response = requests.post(TOKENURL, auth=auth, data=data, verify=True)
access_token = response.json()['access_token']
refresh_token = response.json()['refresh_token']
display(access_token)
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNjAxNjcsImlhdCI6MTY4NDM1NjU2NywianRpIjoiZDJlNTNlNzEtYjYzMi00MzNmLThjY2UtOGIxMDI0ZjFmYzliIiwiaXNzIjoiaHR0cHM6Ly9kb2MtdGVzdC5rZXljbG9hay53YWxsYXJvb2NvbW11bml0eS5uaW5qYS9hdXRoL3JlYWxtcy9tYXN0ZXIiLCJhdWQiOlsibWFzdGVyLXJlYWxtIiwiYWNjb3VudCJdLCJzdWIiOiIwMjhjOGI0OC1jMzliLTQ1NzgtOTExMC0wYjViZGQzODI0ZGEiLCJ0eXAiOiJCZWFyZXIiLCJhenAiOiJhcGktY2xpZW50Iiwic2Vzc2lvbl9zdGF0ZSI6Ijk4MDhkZTA5LWU2NjYtNGIyNC05ZWQ4LTc2MmUxZjllODk0ZSIsImFjciI6IjEiLCJhbGxvd2VkLW9yaWdpbnMiOlsiKiJdLCJyZWFsbV9hY2Nlc3MiOnsicm9sZXMiOlsiZGVmYXVsdC1yb2xlcy1tYXN0ZXIiLCJvZmZsaW5lX2FjY2VzcyIsInVtYV9hdXRob3JpemF0aW9uIl19LCJyZXNvdXJjZV9hY2Nlc3MiOnsibWFzdGVyLXJlYWxtIjp7InJvbGVzIjpbIm1hbmFnZS11c2VycyIsInZpZXctdXNlcnMiLCJxdWVyeS1ncm91cHMiLCJxdWVyeS11c2VycyJdfSwiYWNjb3VudCI6eyJyb2xlcyI6WyJtYW5hZ2UtYWNjb3VudCIsIm1hbmFnZS1hY2NvdW50LWxpbmtzIiwidmlldy1wcm9maWxlIl19fSwic2NvcGUiOiJwcm9maWxlIGVtYWlsIiwic2lkIjoiOTgwOGRlMDktZTY2Ni00YjI0LTllZDgtNzYyZTFmOWU4OTRlIiwiZW1haWxfdmVyaWZpZWQiOmZhbHNlLCJodHRwczovL2hhc3VyYS5pby9qd3QvY2xhaW1zIjp7IngtaGFzdXJhLXVzZXItaWQiOiIwMjhjOGI0OC1jMzliLTQ1NzgtOTExMC0wYjViZGQzODI0ZGEiLCJ4LWhhc3VyYS1kZWZhdWx0LXJvbGUiOiJ1c2VyIiwieC1oYXN1cmEtYWxsb3dlZC1yb2xlcyI6WyJ1c2VyIl0sIngtaGFzdXJhLXVzZXItZ3JvdXBzIjoie30ifSwibmFtZSI6IkpvaG4gSGFuc2FyaWNrIiwicHJlZmVycmVkX3VzZXJuYW1lIjoiam9obi5odW1tZWxAd2FsbGFyb28uYWkiLCJnaXZlbl9uYW1lIjoiSm9obiIsImZhbWlseV9uYW1lIjoiSGFuc2FyaWNrIiwiZW1haWwiOiJqb2huLmh1bW1lbEB3YWxsYXJvby5haSJ9.Va5OiLkedj-mFuI7UhsxXshmaSbhXthLv-PU56f2JiQi5-wiWXXxRk3pUioavIzKbi-VjmYQfbR95VY5QSWpUuD3scPuRhHDkSeslz6390phYiFygK_PmXMviQnL2q1mwdGzwh69htOjUWLf7MGWjNmkNdzjYyBy8gfD3V2O2MCfN3onVVCqr1aA1aAQXe9y_JswhjooxAQGit1xzNicvm3IW3QhHtOrDKj7gXNuSlc5vKqe52RQYEgElltqIOV4PVe12UGthKMfvdlDIeUEpTzXVFRH8XHJCrO_YQ_W9m-Rt1_9kelBl3SksdYKOisZaGwo6lv7hhapembH0iD29Q'
## Pipeline Inference URL token - does not require Wallaroo username/password.

TOKENURL=f'https://{wallarooPrefix}keycloak.{wallarooSuffix}/auth/realms/master/protocol/openid-connect/token'

# Retrieving through os environmental variables 
f = open('./creds.json')
login_data = json.load(f)

CONFIDENTIAL_CLIENT=login_data["confidentialClient"]
CLIENT_SECRET=login_data["confidentialPassword"]

auth = HTTPBasicAuth(CONFIDENTIAL_CLIENT, CLIENT_SECRET)
data = {
    'grant_type': 'client_credentials'
}
response = requests.post(TOKENURL, auth=auth, data=data, verify=True)
inference_access_token = response.json()['access_token']
display(inference_access_token)
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNjAxNjgsImlhdCI6MTY4NDM1NjU2OCwianRpIjoiNjIyOTViYmUtYzVlMi00NDQ2LWFmMDctNzY5MDAwNmI2NzI3IiwiaXNzIjoiaHR0cHM6Ly9kb2MtdGVzdC5rZXljbG9hay53YWxsYXJvb2NvbW11bml0eS5uaW5qYS9hdXRoL3JlYWxtcy9tYXN0ZXIiLCJhdWQiOlsibWFzdGVyLXJlYWxtIiwiYWNjb3VudCJdLCJzdWIiOiJmNmI5ODg5NC1iZTVjLTQyZDUtYTZhNS02ZjE5ZTY1YmNiNGEiLCJ0eXAiOiJCZWFyZXIiLCJhenAiOiJhcGktY2xpZW50IiwiYWNyIjoiMSIsImFsbG93ZWQtb3JpZ2lucyI6WyIqIl0sInJlYWxtX2FjY2VzcyI6eyJyb2xlcyI6WyJkZWZhdWx0LXJvbGVzLW1hc3RlciIsIm9mZmxpbmVfYWNjZXNzIiwidW1hX2F1dGhvcml6YXRpb24iXX0sInJlc291cmNlX2FjY2VzcyI6eyJtYXN0ZXItcmVhbG0iOnsicm9sZXMiOlsiaW1wZXJzb25hdGlvbiIsIm1hbmFnZS11c2VycyIsInZpZXctdXNlcnMiLCJxdWVyeS1ncm91cHMiLCJxdWVyeS11c2VycyJdfSwiYWNjb3VudCI6eyJyb2xlcyI6WyJtYW5hZ2UtYWNjb3VudCIsIm1hbmFnZS1hY2NvdW50LWxpbmtzIiwidmlldy1wcm9maWxlIl19fSwic2NvcGUiOiJwcm9maWxlIGVtYWlsIiwiZW1haWxfdmVyaWZpZWQiOmZhbHNlLCJjbGllbnRJZCI6ImFwaS1jbGllbnQiLCJjbGllbnRIb3N0IjoiMTAuMjQ0LjEuNzQiLCJodHRwczovL2hhc3VyYS5pby9qd3QvY2xhaW1zIjp7IngtaGFzdXJhLXVzZXItaWQiOiJmNmI5ODg5NC1iZTVjLTQyZDUtYTZhNS02ZjE5ZTY1YmNiNGEiLCJ4LWhhc3VyYS1kZWZhdWx0LXJvbGUiOiJ1c2VyIiwieC1oYXN1cmEtYWxsb3dlZC1yb2xlcyI6WyJ1c2VyIl0sIngtaGFzdXJhLXVzZXItZ3JvdXBzIjoie30ifSwicHJlZmVycmVkX3VzZXJuYW1lIjoic2VydmljZS1hY2NvdW50LWFwaS1jbGllbnQiLCJjbGllbnRBZGRyZXNzIjoiMTAuMjQ0LjEuNzQifQ.fde1-NsmXqCen71sRcIarscK1j4oFGATf8jh834aAUSb_UGXEmxEnUDDGMegu7KmpbeOi2ogIGY0ndACaZqS21lvVpzWHyVdsQGXCtl1mjwgLt0kzq6U5uR8znMIV-2Babw-9eE65F9I3TdUKRlnh8J5SAPvbOj_Hv_Y3u4cNj1b_Hk_o9lAEg-m2V0ZL7UDxgnVyitbWChiP4DE3q6yBBSVoORiBXDrfUiwIpCXyVKJIO_HrowEA8bYVOhh8PcbywmVa1kZaPcMuAOzsaysE361NCvJqbikVf4KX5Ii9k7lk90v3c-9VX24bIC67HFG8TwvWVnKRBAawwXcQ2ZTIA'

Through the Wallaroo SDK

The Wallaroo SDK method Wallaroo Client wl.auth.auth_header() method provides the token with the Authorization header.

# Retrieve the token
headers = wl.auth.auth_header()
display(headers)

{'Authorization': 'Bearer abcdefg'}

Connect to Wallaroo

For this example, a connection to the Wallaroo SDK is used. This will be used to retrieve the JWT token for the MLOps API calls.

This example will store the user’s credentials into the file ./creds.json which contains the following:

{
    "username": "{Connecting User's Username}", 
    "password": "{Connecting User's Password}", 
    "email": "{Connecting User's Email Address}"
}

Replace the username, password, and email fields with the user account connecting to the Wallaroo instance. This allows a seamless connection to the Wallaroo instance and bypasses the standard browser based confirmation link. For more information, see the Wallaroo SDK Essentials Guide: Client Connection.

For wallarooPrefix = "YOUR PREFIX." and wallarooSuffix = "YOUR SUFFIX", enter the prefix and suffix for your Wallaroo instance DNS name. If the prefix instance is blank, then it can be wallarooPrefix = "". Note that the prefix includes the . for proper formatting.

# Retrieve the login credentials.
os.environ["WALLAROO_SDK_CREDENTIALS"] = './creds.json'

# Client connection from local Wallaroo instance

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                     auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                     auth_type="user_password")

API URL

The variable APIURL is used to specify the connection to the Wallaroo instance’s MLOps API URL.

APIURL=f"https://{wallarooPrefix}api.{wallarooSuffix}"

API Request Methods

This tutorial relies on the Python requests library, and the Wallaroo Wallaroo Client wl.auth.auth_header() method.

MLOps API requests are always POST. Most are submitted with the header 'Content-Type':'application/json' unless specified otherwise.

1 - Wallaroo MLOps API Essentials Guide: User Management

How to use the Wallaroo API for User Management

Users

Get Users

Users can be retrieved either by their Keycloak user id, or return all users if an empty set {} is submitted.

  • Parameters
    • {}: Empty set, returns all users.
    • user_ids Array[Keycloak user ids]: An array of Keycloak user ids, typically in UUID format.

Example: The first example will submit an empty set {} to return all users, then submit the first user’s user id and request only that user’s details.

# Get all users

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/users/query"
data = {
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'users': {'028c8b48-c39b-4578-9110-0b5bdd3824da': {'access': {'manageGroupMembership': True,
    'impersonate': False,
    'view': True,
    'manage': True,
    'mapRoles': True},
   'createdTimestamp': 1684355671859,
   'disableableCredentialTypes': [],
   'email': 'john.hummel@wallaroo.ai',
   'emailVerified': False,
   'enabled': True,
   'firstName': 'John',
   'id': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'lastName': 'Hansarick',
   'notBefore': 0,
   'requiredActions': [],
   'username': 'john.hummel@wallaroo.ai'},
  'de777519-2963-423a-92d2-e6e26d687527': {'access': {'manage': True,
    'manageGroupMembership': True,
    'mapRoles': True,
    'impersonate': False,
    'view': True},
   'createdTimestamp': 1684355337295,
   'disableableCredentialTypes': [],
   'emailVerified': False,
   'enabled': True,
   'id': 'de777519-2963-423a-92d2-e6e26d687527',
   'notBefore': 0,
   'requiredActions': [],
   'username': 'admin'}}}
# Get first user Keycloak id
# Retrieve the token 
headers = wl.auth.auth_header()

# retrieved from the previous request
first_user_keycloak = list(response['users'])[0]

api_request = f"{APIURL}/v1/api/users/query"

data = {
  "user_ids": [
    first_user_keycloak
  ]
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'users': {'028c8b48-c39b-4578-9110-0b5bdd3824da': {'access': {'mapRoles': True,
    'view': True,
    'manage': True,
    'manageGroupMembership': True,
    'impersonate': False},
   'createdTimestamp': 1684355671859,
   'disableableCredentialTypes': [],
   'email': 'john.hummel@wallaroo.ai',
   'emailVerified': False,
   'enabled': True,
   'federatedIdentities': [{'identityProvider': 'google',
     'userId': '117610299312093432527',
     'userName': 'john.hummel@wallaroo.ai'}],
   'firstName': 'John',
   'id': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'lastName': 'Hansarick',
   'notBefore': 0,
   'requiredActions': [],
   'username': 'john.hummel@wallaroo.ai'}}}

Invite Users

IMPORTANT NOTE: This command is for YOUR SUFFIX only. For more details on user management, see Wallaroo User Management.

Users are invited through /users/invite. When using YOUR SUFFIX, this will send an invitation email to the email address listed. Note that the user must not already be a member of the Wallaroo instance, and email addresses must be unique. If the email address is already in use for another user, the request will generate an error.

  • Parameters
    • email *(REQUIRED string): The email address of the new user to invite.
    • password (OPTIONAL string): The assigned password of the new user to invite. If not provided, the Wallaroo instance will provide the new user a temporary password that must be changed upon initial login.

Example: In this example, a new user will be invited to the Wallaroo instance and assigned a password.

# invite users

Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/users/invite"

data = {
    "email": new_user,
    "password":new_user_password
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response

Deactivate User

Users can be deactivated so they can not login to their Wallaroo instance. Deactivated users do not count against the Wallaroo license count.

  • Parameters
    • email (REQUIRED string): The email address of the user to deactivate.

Example: In this example, the deactivated_user will be deactivated.

## Deactivate users

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/users/deactivate"

data = {
    "email": new_user
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response

Activate User

A deactivated user can be reactivated to allow them access to their Wallaroo instance. Activated users count against the Wallaroo license count.

  • Parameters
    • email (REQUIRED string): The email address of the user to activate.

Example: In this example, the activated_user will be activated.

## Activate users

# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/users/activate"

data = {
    "email": new_user
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response

Troubleshooting

When a new user logs in for the first time, they get an error when uploading a model or issues when they attempt to log in. How do I correct that?

When a new registered user attempts to upload a model, they may see the following error:

TransportQueryError: 
{'extensions': 
    {'path': 
        '$.selectionSet.insert_workspace_one.args.object[0]', 'code': 'not-supported'
    }, 
    'message': 
        'cannot proceed to insert array relations since insert to table "workspace" affects zero rows'

Or if they log into the Wallaroo Dashboard, they may see a Page not found error.

This is caused when a user has been registered without an appropriate email address. See the user guides here on inviting a user, or the Wallaroo Enterprise User Management on how to log into the Keycloak service and update users. Verify that the username and email address are both the same, and they are valid confirmed email addresses for the user.

2 - Wallaroo MLOps API Essentials Guide: Workspace Management

How to use the Wallaroo API for Workspace Management

Workspace Naming Requirements

Workspace names map onto Kubernetes objects, and must be DNS compliant. Workspace names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Workspaces

List Workspaces

List the workspaces for a specific user.

  • Parameters
    • user_id - (OPTIONAL string): The Keycloak ID.

Example: In this example, the workspaces for all users will be displayed.

# List workspaces

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/list"

data = {
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'workspaces': [{'id': 1,
   'name': 'john.hummel@wallaroo.ai - Default Workspace',
   'created_at': '2023-05-17T20:36:36.312003+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [],
   'pipelines': []},
  {'id': 5,
   'name': 'housepricedrift',
   'created_at': '2023-05-17T20:41:50.351766+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [1],
   'pipelines': [1]},
  {'id': 6,
   'name': 'sdkquickworkspace',
   'created_at': '2023-05-17T20:43:36.727099+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [2, 3],
   'pipelines': [3]}]}

Create Workspace

A new workspace will be created in the Wallaroo instance. Upon creating, the workspace owner will be assigned as the user making the MLOps API request.

  • Parameters:
    • workspace_name - (REQUIRED string): The name of the new workspace with the following requirements:
      • Must be unique.
      • DNS compliant with only lowercase characters.
  • Returns:
    • workspace_id - (int): The ID of the new workspace.

Example: In this example, a workspace with the name testapiworkspace will be created, and the newly created workspace’s workspace_id saved as the variable example_workspace_id for use in other code examples. After the request is complete, the List Workspaces command will be issued to demonstrate the new workspace has been created.

# Create workspace
# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/create"

data = {
  "workspace_name": example_workspace_name
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
display(response)
# Stored for future examples
example_workspace_id = response['workspace_id']
{'workspace_id': 7}
## List workspaces

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/list"

data = {
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'workspaces': [{'id': 1,
   'name': 'john.hummel@wallaroo.ai - Default Workspace',
   'created_at': '2023-05-17T20:36:36.312003+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [],
   'pipelines': []},
  {'id': 5,
   'name': 'housepricedrift',
   'created_at': '2023-05-17T20:41:50.351766+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [1],
   'pipelines': [1]},
  {'id': 6,
   'name': 'sdkquickworkspace',
   'created_at': '2023-05-17T20:43:36.727099+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [2, 3],
   'pipelines': [3]},
  {'id': 7,
   'name': 'apiworkspaces',
   'created_at': '2023-05-17T20:50:36.298217+00:00',
   'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'archived': False,
   'models': [],
   'pipelines': []}]}

Add User to Workspace

Existing users of the Wallaroo instance can be added to an existing workspace.

  • Parameters
    • email - (REQUIRED string): The email address of the user to add to the workspace. This user must already exist in the Wallaroo instance.
    • workspace_id - (REQUIRED int): The id of the workspace.

Example: The following example adds the user created in Invite Users request to the workspace created in the Create Workspace request.

# Add existing user to existing workspace

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/add_user"

data = {
  "email":new_user,
  "workspace_id": example_workspace_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{}

List Users in a Workspace

Lists the users who are either owners or collaborators of a workspace.

  • Parameters
    • workspace_id - (REQUIRED int): The id of the workspace.
  • Returns
    • user_id: The user’s Keycloak identification.
    • user_type: The user’s workspace type (owner, co-owner, etc).

Example: The following example will list all users part of the workspace created in the Create Workspace request.

# List users in a workspace

# Retrieve the token 

headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/list_users"

data = {
  "workspace_id": example_workspace_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'users': [{'user_id': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'user_type': 'OWNER'},
  {'user_id': 'c64d26bc-5d30-4d0d-9ae9-9c0089bc4b80',
   'user_type': 'COLLABORATOR'}]}

Remove User from a Workspace

Removes the user from the given workspace. In this request, either the user’s Keycloak ID is required OR the user’s email address is required.

  • Parameters
    • workspace_id - (REQUIRED int): The id of the workspace.
    • user_id - (string): The Keycloak ID of the user. If email is not provided, then this parameter is REQUIRED.
    • email - (string): The user’s email address. If user_id is not provided, then this parameter is REQUIRED.
  • Returns
    • user_id: The user’s identification.
    • user_type: The user’s workspace type (owner, co-owner, etc).

Example: The following example will remove the newUser from workspace created in the Create Workspace request. Then the users for that workspace will be listed to verify newUser has been removed.

# Remove existing user from an existing workspace

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/remove_user"

data = {
  "email":new_user,
  "workspace_id": example_workspace_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'affected_rows': 1}
## List users in a workspace

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/users/query"

data = {
  "workspace_id": example_workspace_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'users': {'de777519-2963-423a-92d2-e6e26d687527': {'access': {'mapRoles': True,
    'view': True,
    'manageGroupMembership': True,
    'impersonate': False,
    'manage': True},
   'createdTimestamp': 1684355337295,
   'disableableCredentialTypes': [],
   'emailVerified': False,
   'enabled': True,
   'id': 'de777519-2963-423a-92d2-e6e26d687527',
   'notBefore': 0,
   'requiredActions': [],
   'username': 'admin'},
  'c64d26bc-5d30-4d0d-9ae9-9c0089bc4b80': {'access': {'manage': True,
    'mapRoles': True,
    'manageGroupMembership': True,
    'impersonate': False,
    'view': True},
   'createdTimestamp': 1684356685177,
   'disableableCredentialTypes': [],
   'email': 'john.hansarick@wallaroo.ai',
   'emailVerified': True,
   'enabled': True,
   'firstName': 'John',
   'id': 'c64d26bc-5d30-4d0d-9ae9-9c0089bc4b80',
   'lastName': 'Hansarick',
   'notBefore': 0,
   'requiredActions': [],
   'username': 'john.hansarick@wallaroo.ai'},
  '028c8b48-c39b-4578-9110-0b5bdd3824da': {'access': {'mapRoles': True,
    'view': True,
    'impersonate': False,
    'manage': True,
    'manageGroupMembership': True},
   'createdTimestamp': 1684355671859,
   'disableableCredentialTypes': [],
   'email': 'john.hummel@wallaroo.ai',
   'emailVerified': False,
   'enabled': True,
   'firstName': 'John',
   'id': '028c8b48-c39b-4578-9110-0b5bdd3824da',
   'lastName': 'Hansarick',
   'notBefore': 0,
   'requiredActions': [],
   'username': 'john.hummel@wallaroo.ai'}}}

3 - Wallaroo MLOps API Essentials Guide: Model Management

How to use the Wallaroo API for Model Management

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Models

Upload Model to Workspace

ML Models are uploaded to Wallaroo through the following endpoint:

Models uploaded through this method that are not native runtimes are containerized within the Wallaroo instance then run by the Wallaroo engine. See Wallaroo MLOps API Essentials Guide: Pipeline Management for details on pipeline configurations and deployments.

For these models, the following inputs are required.

  • Endpoint:
    • /v1/api/models/upload_and_convert
  • Headers:
    • Content-Type: multipart/form-data
  • Parameters
    • name (String Required): The model name.
    • visibility (String Required): Either public or private.
    • workspace_id (String Required): The numerical ID of the workspace to upload the model to.
    • conversion (String Required): The conversion parameters that include the following:
      • framework (String Required): The framework of the model being uploaded. See the list of supported models for more details.
      • python_version (String Required): The version of Python required for model.
      • requirements (String Required): Required libraries. Can be [] if the requirements are default Wallaroo JupyterHub libraries.
      • input_schema (String Optional): The input schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
      • output_schema (String Optional): The output schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.

Upload Native Runtime Model Example

ONNX are always native runtimes. The following example shows uploading an ONNX model to a Wallaroo instance using the requests library. Note that the input_schema and output_schema encoded details are not required.

 authorization header
headers = {'Authorization': 'Bearer abcdefg'}

apiRequest = f"{APIURL}/v1/api/models/upload_and_convert"

framework='onnx'

model_name = f"{suffix}ccfraud"

data = {
    "name": model_name,
    "visibility": "public",
    "workspace_id": workspaceId,
    "conversion": {
        "framework": framework,
        "python_version": "3.8",
        "requirements": []
    }
}

files = {
    "metadata": (None, json.dumps(data), "application/json"),
    'file': (model_name, open('./ccfraud.onnx', 'rb'), "application/octet-stream")
    }

response = requests.post(apiRequest, files=files, headers=headers).json()
'Sample model name: apimodel'

‘Sample model file: ./models/ccfraud.onnx’

{‘insert_models’: {‘returning’: [{‘models’: [{‘id’: 4}]}]}}

Upload Converted Model Examples

The following example shows uploading a Hugging Face model to a Wallaroo instance using the requests library. Note that the input_schema and output_schema encoded details are required.

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])

encoded_input_schema = base64.b64encode(
                bytes(input_schema.serialize())
            ).decode("utf8")

encoded_output_schema = base64.b64encode(
                bytes(output_schema.serialize())
            ).decode("utf8")

metadata = {
    "name": model_name,
    "visibility": "private",
    "workspace_id": workspace_id,
    "conversion": {
        "framework": framework,
        "python_version": "3.8",
        "requirements": []
    },
    "input_schema": encoded_input_schema,
    "output_schema": encoded_output_schema,
}

headers = wl.auth.auth_header()

files = {
    'metadata': (None, json.dumps(metadata), "application/json"),
    'file': (model_name, open(model_path,'rb'),'application/octet-stream')
}

response = requests.post('https://{APIURL}/v1/api/models/upload_and_convert', 
                         headers=headers, 
                         files=files).json()

Stream Upload Model to Workspace

Streams a potentially large ML Model to a Wallaroo workspace via POST with Content-Type: multipart/form-data.

  • Parameters
    • name - (REQUIRED string): Name of the model. Must only include alphanumeric characters.
    • filename - (REQUIRED string): Name of the file being uploaded.
    • visibility - (OPTIONAL string): The visibility of the model as either public or private.
    • workspace_id - (REQUIRED int): The numerical id of the workspace to upload the model to.

Example: This example will upload the sample file ccfraud.onnx to the workspace created in the Create Workspace step as apitestmodel.

# stream upload model - next test is adding arbitrary chunks to the stream

# Retrieve the token 
headers = wl.auth.auth_header()

# Set the contentType
headers['contentType']='application/octet-stream'

api_request = f"{APIURL}/v1/api/models/upload_stream"

# Model name and file to use
display(f"Sample stream model name: {stream_model_name}")
display(f"Sample model file: {stream_model_file_name}")

data = {
    "name":stream_model_name,
    "filename": stream_model_file_name,
    "visibility":"public",
    "workspace_id": example_workspace_id
}

files = {
    'file': (stream_model_name, open(stream_model_file_name, 'rb'))
    }

response = requests.post(apiRequest, files=files, data=data, headers=headers).json()
response
'Sample stream model name: apiteststreammodel'

‘Sample model file: ./models/ccfraud.onnx’

{‘insert_models’: {‘returning’: [{‘models’: [{‘id’: 5}]}]}}

List Models in Workspace

Returns a list of models added to a specific workspace.

  • Parameters
    • workspace_id - (REQUIRED int): The workspace id to list.

Example: Display the models for the workspace used in the Upload Model to Workspace step. The model id and model name will be saved as example_model_id and exampleModelName variables for other examples.

# List models in a workspace
# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/models/list"

data = {
  "workspace_id": example_workspace_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'models': [{'id': 5,
   'name': 'apiteststreammodel',
   'owner_id': '""',
   'created_at': '2023-05-17T20:51:53.077997+00:00',
   'updated_at': '2023-05-17T20:51:53.077997+00:00'},
  {'id': 4,
   'name': 'apimodel',
   'owner_id': '""',
   'created_at': '2023-05-17T20:51:51.092416+00:00',
   'updated_at': '2023-05-17T20:51:51.092416+00:00'}]}
model = next(model for model in response["models"] if model["name"] == "apimodel")
example_model_id = model['id']

Get Model Details By ID

Returns the model details by the specific model id.

  • Parameters
    • workspace_id - (REQUIRED int): The workspace id to list.
  • Returns
    • id - (int): Numerical id of the model.
    • owner_id - (string): Id of the owner of the model.
    • workspace_id - (int): Numerical of the id the model is in.
    • name - (string): Name of the model.
    • updated_at - (DateTime): Date and time of the model’s last update.
    • created_at - (DateTime): Date and time of the model’s creation.
    • model_config - (string): Details of the model’s configuration.

Example: Retrieve the details for the model uploaded in the Upload Model to Workspace step.

# Get model details by id
# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/models/get_by_id"

data = {
  "id": example_model_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'id': 4,
 'owner_id': '""',
 'workspace_id': 7,
 'name': 'apimodel',
 'updated_at': '2023-05-17T20:51:51.092416+00:00',
 'created_at': '2023-05-17T20:51:51.092416+00:00',
 'model_config': None}

Get Model Versions

Retrieves all versions of a model based on either the name of the model or the model_pk_id.

  • Parameters
    • model_id - (REQUIRED String): The model name.
    • models_pk_id - (REQUIRED int): The model integer pk id.
  • Returns
    • Array(Model Details)
      • sha - (String): The sha hash of the model version.
      • models_pk_id- (int): The pk id of the model.
      • model_version - (String): The UUID identifier of the model version.
      • owner_id - (String): The Keycloak user id of the model’s owner.
      • model_id - (String): The name of the model.
      • id - (int): The integer id of the model.
      • file_name - (String): The filename used when uploading the model.
      • image_path - (String): The image path of the model.

Example: Retrieve the versions for a previously uploaded model. The variables example_model_version and example_model_sha will store the model’s version and SHA values for use in other examples.

## List model versions

# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/models/list_versions"

data = {
  "model_id": model_name,
  "models_pk_id": example_model_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
[{'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
  'models_pk_id': 4,
  'model_version': '0b989008-5f1d-453e-8085-98d97be1b722',
  'owner_id': '""',
  'model_id': 'apimodel',
  'id': 4,
  'file_name': 'apimodel',
  'image_path': None,
  'status': 'ready'}]
# Stored for future examples

example_model_version = response[-1]['model_version']
example_model_sha = response[-1]['sha']

Get Model Configuration by Id

Returns the model’s configuration details.

  • Parameters
    • model_id - (REQUIRED int): The numerical value of the model’s id.

Example: Submit the model id for the model uploaded in the Upload Model to Workspace step to retrieve configuration details.

## Get model config by id

# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/models/get_config_by_id"

data = {
  "model_id": example_model_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'model_config': None}

Get Model Details

Returns details regarding a single model, including versions.

Returns the model’s configuration details.

  • Parameters
    • model_id - (REQUIRED int): The numerical value of the model’s id.

Example: Submit the model id for the model uploaded in the Upload Model to Workspace step to retrieve configuration details.

# Get model config by id
# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/models/get"

data = {
  "id": example_model_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'id': 4,
 'name': 'apimodel',
 'owner_id': '""',
 'created_at': '2023-05-17T20:51:51.092416+00:00',
 'updated_at': '2023-05-17T20:51:51.092416+00:00',
 'models': [{'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
   'models_pk_id': 4,
   'model_version': '0b989008-5f1d-453e-8085-98d97be1b722',
   'owner_id': '""',
   'model_id': 'apimodel',
   'id': 4,
   'file_name': 'apimodel',
   'image_path': None}]}

4 - Wallaroo MLOps API Essentials Guide: Model Registry

How to use the Wallaroo API for Model Registry aka Artifact Registries

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the file itself. This must comply with the Wallaroo model requirements framework and version or it will not be deployed. Note that for models that fall outside of the supported model types, they can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the specific framework what runtime a specific model framework runs in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.2.1 (July 2023)1.12.18173
2023.2 (May 2023)1.12.18173
2023.1 (March 2023)1.12.18173
2022.4 (December 2022)1.12.18173
After April 2022 until release 2022.4 (December 2022)1.10.*7152
Before April 20221.6.*7132

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]
ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.1
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0
ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
RuntimeContainerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING-FACE-IMAGE-TO-TEXT
  • Framework.HUGGING-FACE-TEXT-CLASSIFICATION
  • Framework.HUGGING-FACE-SUMMARIZATION
  • Framework.HUGGING-FACE-TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING-FACE-FEATURE-EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SENTIMENT-ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
RuntimeContainerized aka mlflow

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.2.2
FrameworkFramework.SKLEARN aka sklearn
RuntimeContainerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.8.0
  • keras==1.1.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
RuntimeContainerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Librariesxgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
RuntimeContainerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model Optional[Any]An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when expected_model_types is not set.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.30.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the Wallaroo.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

Registry Services Roles

Registry service use in Wallaroo typically falls under the following roles.

RoleRecommended ActionsDescription
DevOps EngineerCreate Model RegistryCreate the model (AKA artifact) registry service
 Retrieve Model Registry TokensGenerate the model registry service credentials.
MLOps EngineerConnect Model Registry to WallarooAdd the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts.
 Add Wallaroo Registry Service to WorkspaceAdd the registry service configuration to a Wallaroo workspace for use by workspace users.
 Get Registry DetailsRetrieve the connection details for a Wallaroo registry.
 Remove Wallaroo Registry from a WorkspaceAdd the registry service configuration to a Wallaroo workspace for use by workspace users.
Data ScientistList Models in RegistryList available models in a model registry.
 List Model Version ArtifactsRetrieve the artifacts (usually files) for a model stored in a model registry.
 Upload Model from RegistryUpload a model and artifacts stored in a model registry into a Wallaroo workspace.

Model Registry Operations

The following links to guides and information on setting up a model registry (also known as an artifact registry).

Create Model Registry

See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.

The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.

  1. Log into the Azure Databricks workspace.
  2. From the upper right corner access the User Settings.
  3. From the Access tokens, select Generate new token.
  4. Specify any token description and lifetime. Once complete, select Generate.
  5. Copy the token and store in a secure place. Once the Generate New Token module is closed, the token will not be retrievable.
Retrieve Azure Databricks User Token

The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Guides.

A generic MLFlow model registry requires no token.

Wallaroo Registry Operations

  • Connect Model Registry to Wallaroo: This details the link and connection information to a existing MLFlow registry service. Note that this does not create a MLFlow registry service, but adds the connection and credentials to Wallaroo to allow that MLFlow registry service to be used by other entities in the Wallaroo instance.
  • Add a Registry to a Workspace: Add the created Wallaroo Model Registry so make it available to other workspace members.
  • Remove a Registry from a Workspace: Remove the link between a Wallaroo Model Registry and a Wallaroo workspace.

Connect Model Registry to Wallaroo

MLFlow Registry connection information is added to a Wallaroo instance through the following endpoint.

  • REQUEST URL
    • v1/api/models/create_registry
  • PARAMETERS
    • workspace_id (Integer Required): The numerical ID of the workspace to create the registry in.
    • name (String Required): The name for the registry. Registry names are not unique.
    • url (String Required): The full URL of the registry service. For example: https://registry.wallaroo.ai
    • token (String Required): The authentication token used by the registry service.
  • RETURNS
    • id (String): The UUID of the registry.
    • workspace_id: The numerical ID of the workspace the registry was connected to.

Connect Model Registry to Wallaroo Example

The following registry will be added to the workspace with the id 1.

import requests
token = "abcdefg"

x = requests.post("https://{APIURL}/v1/api/models/create_registry", 
                    json={
                        "workspace_id": 1, 
                        "name": "sample registry", 
                        "url": "https://registry.wallaroo.ai", 
                        "token": token
                        }, 
                        headers=wl.auth.auth_header()
                    )

{'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 'workspace_id': 1}

Add Wallaroo Registry Service to Workspace

Registries are assigned to a Wallaroo workspace with the following endpoint. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.

  • REQUEST URL
    • v1/api/models//v1/api/models/attach_registry_to_workspace
  • PARAMETERS
    • workspace_id (Integer Required): The numerical ID of the workspace to create the registry in.
    • registry_id (String Required): The ID of the registry in UUID format.
  • RETURNS
    • id (String): The UUID of the registry.
    • workspace_id: The numerical ID of the workspace the registry was connected to.

Add Wallaroo Registry Service to Workspace Example

import requests
token = "abcdefg"

x = requests.post("https://{APIURL}/v1/api/models/attach_registry_to_workspace", 
                    json={
                      'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 
                      'workspace_id': 1},
                        headers=wl.auth.auth_header()
                    )

{
  "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "workspace_id": 0
}

Remove Wallaroo Registry from a Workspace

Registries are removed from a registry with the following endpoint. This does not remove the registry connection information from the Wallaroo instance, merely removes the association between the registry and that particular workspace.

  • REQUEST URL
    • v1/api/models//v1/api/models/remove_registry_from_workspace
  • PARAMETERS
    • workspace_id (Integer Required): The numerical ID of the workspace to remove the registry from.
    • registry_id (String Required): The ID of the registry in UUID format.
  • RETURNS
    • id (String): The UUID of the registry.
    • workspace_id: The numerical ID of the workspace the registry was connected to.

Remove Wallaroo Registry from a Workspace Example

import requests
token = "abcdefg"

x = requests.post("https://{APIURL}/v1/api/models/remove_registry_from_workspace", 
                    json={
                      'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 
                      'workspace_id': 1
                      },
                        headers=wl.auth.auth_header()
                    )

{
  "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "workspace_id": 0
}

Get Registry Details

  • REQUEST URL
    • v1/api/models/get_registry
  • PARAMETERS
    • id (String Required): The registry ID in UUID format.
  • RETURNS
    • id (String): The UUID of the registry.
    • name (String Required): The name for the registry. Registry names are not unique.
    • url (String Required): The full URL of the registry service. For example: https://registry.wallaroo.ai
    • token (String Required): The authentication token used by the registry service.
    • created_at (String Required): The creation date in DateTime format.
    • updated_at (String Required): The updated date in DateTime format.

Get Registry Details Example

The following example demonstrates retrieving the registry with id 98f9ca1d-c4e7-4d70-8df4-05c25a64be29.

import requests
x = requests.post("https://{APIURL}/v1/api/models/get_registry", 
                    json={
                        "registry_id": '98f9ca1d-c4e7-4d70-8df4-05c25a64be29'
                        }, 
                    headers=wl.auth.auth_header()
                )
{
    'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29',
    'name': 'sample registry',
    'url': 'https://registry.wallaroo.ai',
    'token': 'dapi67c8c0b04606f730e78b7ae5e3221015-3',
    'created_at': '2023-06-23T15:37:38.38427+00:00',
    'updated_at': '2023-06-23T15:37:38.38427+00:00'
}

Wallaroo Registry Model Operations

List Registries in Workspace

A list of registries associates with a specific workspace are retrieved from the following endpoint.

  • REQUEST URL
    • POST /v1/api/models/list_registries
  • PARAMETERS
    • workspace-id (String Required): The numerical id of the workspace.
  • RETURNS
    • A list of registries with the following fields.
      • created_at (String): The UTC DateTime stamp of when the registry was created.
      • id (String): The id of the Wallaroo registry in UUID format.
      • name (String): The assigned name of the registry.
      • token (String): The authentication token to the model registry.
      • updated_at (String): The UTC DateTime stamp of when the registry was last updated.
      • url (String): The URL to the registry service.

List Registries in Workspace Example

import requests
x = requests.post("{APIURL}v1/api/models/list_registries", 
                    json={
                      "workspace_id": 1
                      }, 
                      headers=wl.auth.auth_header()
                      )
x.json()

[
  {
    "created_at": "2023-07-11T15:21:24.403Z",
    "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "name": "sample registry",
    "token": "string",
    "updated_at": "2023-07-11T15:21:24.403Z",
    "url": "string"
  }
]

List Models in a Registry

A List of models available to the Wallaroo instance through the MLFlow Registry is performed with the following endpoint.

  • REQUEST URL
    • v1/api/models/list_registry_models
  • PARAMETERS
    • id (String Required): The registry ID in UUID format.
  • RETURNS
    • next_page_token (String): Used to retrieve the next List of models. If None, there are no additional models to list after this request.
    • registered_models (List): List of Models based on the mlflow.entities.model_registry.ModelVersion specification.
      • name (String): The name of the model.
      • user_id (String): The ID of the user.
      • latest_versions (List): The list of other versions of the model. If there are no other versions of the model, this will be None. Each version has the following fields.
        • name (String): The name of the artifact.
        • description (String): The description of the artifact.
        • version (Integer): The version number. Other versions may be removed from the registry, so it is possible for a version to be higher than 1 even if there are no other versions still stored in the registry.
        • status (String): The current status of the model.
        • run_id (String): The run id from the MLFlow service that created the model.
        • run_link (String): Link to the run from the MLFlow service that created the model.
        • source (String): The URL for the specific version artifact on this registry service. For example: 'dbfs:/databricks/mlflow-tracking/123456/abcdefg/artifacts/random_forest_model'.
        • current_stage (String): The current stage of the model version.
        • creation_timestamp (Integer): Creation timestamp in milliseconds since the Unix epoch.
        • last_updated_timestamp (Integer): Last updated timestamp in milliseconds since the Unix epoch.

List Models in a Registry Example

import requests
x = requests.post("{APIURL}v1/api/models/list_registry_models", 
                    json={
                      "registry_id": id
                      }, 
                      headers=wl.auth.auth_header()
                      )
x.json()

{
    'next_page_token': None,
    'registered_models': [
        {
            'name': 'testmodel', 
            'user_id': 'sample.usersj@wallaroo.ai', 
            'latest_versions': None, 
            'creation_timestamp': 1686940722329, 
            'last_updated_timestamp': 1686940722329
        },
        {
            'name': 'testmodel2', 
            'user_id': 'sample.user@wallaroo.ai', 
            'latest_versions': None, 
            'creation_timestamp': 1686940864528, 
            'last_updated_timestamp': 1686940864528
        },
        {
            'name': 'wine_quality', 
            'user_id': 'sample.user@wallaroo.ai', 
            'latest_versions': [
                {
                    'name': 'wine_quality', 
                    'description': None, 
                    'version': '1', 
                    'status': 'READY', 
                    'run_id': 'abcdefg', 
                    'run_link': None, 
                    'source': 'dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/random_forest_model', 
                    'current_stage': 'Archived', 
                    'creation_timestamp': 1686942353367, 
                    'last_updated_timestamp': 1686942597509
                },
                {
                    'name': 'wine_quality', 
                    'description': None, 
                    'version': '2', 
                    'status': 'READY', 
                    'run_id': 'abcdefg', 
                    'run_link': None, 
                    'source': 'dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/model', 
                    'current_stage': 'Production', 
                    'creation_timestamp': 1686942576120, 
                    'last_updated_timestamp': 1686942597646
                }
            ], 
            'creation_timestamp': 1686942353127, 
            'last_updated_timestamp': 1686942597646
        }
    ]
}

List Model Version Artifacts

The artifacts of a specific model version are retrieved through the following endpoint.

  • REQUEST URL
    • v1/api/models/list_registry_model_version_artifacts
  • PARAMETERS
    • registry_id (String Required): The registry ID in UUID format.
    • name (String Required): The name of the model to retrieve artifacts from.
    • version (String Required): The version of the model to retrieve artifacts from.
  • RETURNS
    • List of artifacts with the following fields.
      • file_size (Integer): The size of the file in bytes.
      • full_path (String): The full path to the artifact. For example: https://adb-5939996465837398.18.azuredatabricks.net/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
      • is_dir (Boolean): Whether the artifact is a directory or file.
      • modification_time (Integer): Last modification timestamp in milliseconds since the Unix epoch
      • path (String): Relative path to the artifact within the registry. For example: /databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl

List Model Version Artifacts Example

import requests
x = requests.post("{APIURL}v1/api/models//list_registry_model_version_artifacts", 
                    json={
                      "name": "wine_quality",
                      "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
                      "version": "2"
                      }, 
                      headers=wl.auth.auth_header()
                      )
x.json()

[
  {
    "file_size": 156456,
    "full_path": "https://adb-5939996465837398.18.azuredatabricks.net/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl",
    "is_dir": false,
    "modification_time": 1686942597646,
    "path": "/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl"
  }
]

Upload Model from Registry

Models are uploaded from a model registry configured in Wallaroo through the following endpoint. The specific artifact that is the model to be deployed is the item to upload to the Wallaroo workspace. Models must comply with Wallaroo model framework and versions as defined in Artifact Requirements.

  • REQUEST URL
    • v1/api/models/upload_from_registry
  • PARAMETERS
    • registry_id (String Required): The registry ID in UUID format.
    • name (String Required): The name to assign the model in Wallaroo. Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.
    • path (String Required): URL of the model to upload as specified in List Models in a Registry source field.
    • visibility (String Required): Whether the model is public or private.
    • workspace_id (Integer Required): The numerical id of the workspace to upload the model to.
    • conversion (String Required): The conversion parameters that include the following:
      • framework (String Required): The framework of the model being uploaded. See the list of supported models for more details.
      • python_version (String Required): The version of Python required for model.
      • requirements (String Required): Required libraries. Can be [] if the requirements are default Wallaroo JupyterHub libraries.
      • input_schema (String Optional): The input schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
      • output_schema (String Optional): The output schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
  • RETURNS
    • model_id (String): The numerical id of the model uploaded

Upload Model from Registry Example

import requests
x = requests.post("{APIURL}v1/api/models/upload_from_registry", json={
    "registry_id": id, 
    "name": "uploaded-model-name",
    "path": "<DBFS URL from list_artifacts() call here>",
    "visibility": "public",
    "workspace_id": 1,
    "conversion": {
        "framework": "sklearn",
        "requirements": [],
    },
    "input_schema": "<base64-encoded input schema here>",
    "output_schema": "<base64-encoded output schema here>"
}, headers=wl.auth.auth_header())
x.json()

{'model_id': 34}

5 - Wallaroo MLOps API Essentials Guide: Model Upload and Registrations

How to use the Wallaroo API to upload models of different frameworks.

Models are uploaded or registered to a Wallaroo workspace depending on the model framework and type.

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the specific framework what runtime a specific model framework runs in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.2.1 (July 2023)1.12.18173
2023.2 (May 2023)1.12.18173
2023.1 (March 2023)1.12.18173
2022.4 (December 2022)1.12.18173
After April 2022 until release 2022.4 (December 2022)1.10.*7152
Before April 20221.6.*7132

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]
ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.1
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0
ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
RuntimeContainerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING-FACE-IMAGE-TO-TEXT
  • Framework.HUGGING-FACE-TEXT-CLASSIFICATION
  • Framework.HUGGING-FACE-SUMMARIZATION
  • Framework.HUGGING-FACE-TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING-FACE-FEATURE-EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SENTIMENT-ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
RuntimeContainerized aka mlflow

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.2.2
FrameworkFramework.SKLEARN aka sklearn
RuntimeContainerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.8.0
  • keras==1.1.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
RuntimeContainerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Librariesxgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
RuntimeContainerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model Optional[Any]An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when expected_model_types is not set.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.30.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the Wallaroo.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

Upload Model

ML Models are uploaded to Wallaroo through the following endpoint:

Models uploaded through this method that are not native runtimes are containerized within the Wallaroo instance then run by the Wallaroo engine. See Wallaroo MLOps API Essentials Guide: Pipeline Management for details on pipeline configurations and deployments.

For these models, the following inputs are required.

  • Endpoint:
    • /v1/api/models/upload_and_convert
  • Headers:
    • Content-Type: multipart/form-data
  • Parameters
    • name (String Required): The model name.
    • visibility (String Required): Either public or private.
    • workspace_id (String Required): The numerical ID of the workspace to upload the model to.
    • conversion (String Required): The conversion parameters that include the following:
      • framework (String Required): The framework of the model being uploaded. See the list of supported models for more details.
      • python_version (String Required): The version of Python required for model.
      • requirements (String Required): Required libraries. Can be [] if the requirements are default Wallaroo JupyterHub libraries.
      • input_schema (String Optional): The input schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
      • output_schema (String Optional): The output schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.

Upload Native Runtime Model Example

ONNX are always native runtimes. The following example shows uploading an ONNX model to a Wallaroo instance using the requests library. Note that the input_schema and output_schema encoded details are not required.

# authorization header
headers = {'Authorization': 'Bearer abcdefg'}

apiRequest = f"{APIURL}/v1/api/models/upload_and_convert"

framework='onnx'

model_name = f"{suffix}ccfraud"

data = {
    "name": model_name,
    "visibility": "public",
    "workspace_id": workspaceId,
    "conversion": {
        "framework": framework,
        "python_version": "3.8",
        "requirements": []
    }
}

files = {
    "metadata": (None, json.dumps(data), "application/json"),
    'file': (model_name, open('./ccfraud.onnx', 'rb'), "application/octet-stream")
    }


response = requests.post(apiRequest, files=files, headers=headers).json()

Upload Converted Model Examples

The following example shows uploading a Hugging Face model to a Wallaroo instance using the requests library. Note that the input_schema and output_schema encoded details are required.

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])

encoded_input_schema = base64.b64encode(
                bytes(input_schema.serialize())
            ).decode("utf8")

encoded_output_schema = base64.b64encode(
                bytes(output_schema.serialize())
            ).decode("utf8")

metadata = {
    "name": model_name,
    "visibility": "private",
    "workspace_id": workspace_id,
    "conversion": {
        "framework": framework,
        "python_version": "3.8",
        "requirements": []
    },
    "input_schema": encoded_input_schema,
    "output_schema": encoded_output_schema,
}

headers = wl.auth.auth_header()

files = {
    'metadata': (None, json.dumps(metadata), "application/json"),
    'file': (model_name, open(model_path,'rb'),'application/octet-stream')
}

response = requests.post('{APIURL}/v1/api/models/upload_and_convert', 
                         headers=headers, 
                         files=files)

print(response.json())

{'insert_models': {'returning': [{'models': [{'id': 208}]}]}}

6 - Wallaroo MLOps API Essentials Guide: Pipeline Management

How to use the Wallaroo API for Pipeline Management

Pipeline Naming Requirements

Pipeline names map onto Kubernetes objects, and must be DNS compliant. Pipeline names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Pipeline Management

Pipelines are managed through the Wallaroo API or the Wallaroo SDK. Pipelines are the vehicle used for deploying, serving, and monitoring ML models. For more information, see the Wallaroo Glossary.

Create Pipeline in a Workspace

Creates a new pipeline in the specified workspace.

  • Parameters
    • pipeline_id - (REQUIRED string): Name of the new pipeline.
    • workspace_id - (REQUIRED int): Numerical id of the workspace for the new pipeline.
    • definition - (REQUIRED string): Pipeline definitions, can be {} for none.

Example: Two pipelines are created in the workspace created in the step Create Workspace. One will be an empty pipeline without any models, the other will be created using the uploaded models in the Upload Model to Workspace step and no configuration details. The pipeline id, variant id, and variant version of each pipeline will be stored for later examples.

The variable example_workspace_id was created in a previous example.

# Create pipeline in a workspace
# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/pipelines/create"

data = {
  "pipeline_id": empty_pipeline_name,
  "workspace_id": example_workspace_id,
  "definition": {}
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()

empty_pipeline_id = response['pipeline_pk_id']
empty_pipeline_variant_id=response['pipeline_variant_pk_id']
example_pipeline_variant_version=['pipeline_variant_version']
response
{'pipeline_pk_id': 7,
 'pipeline_variant_pk_id': 7,
 'pipeline_variant_version': 'a6dd2cee-58d6-4d24-9e25-f531dbbb95ad'}
# Create pipeline in a workspace with models
# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/pipelines/create"

data = {
  "pipeline_id": model_pipeline_name,
  "workspace_id": example_workspace_id,
  "definition": {
      'steps': [
          {
            'ModelInference': 
              {
                  'models': 
                    [
                        {
                            'name': model_name, 
                            'version': example_model_version, 
                            'sha': example_model_sha
                        }
                    ]
              }
          }
        ]
      }
    }

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
model_pipeline_id = response['pipeline_pk_id']
model_pipeline_variant_id=response['pipeline_variant_pk_id']
model_pipeline_variant_version=['pipeline_variant_version']
response
{'pipeline_pk_id': 8,
 'pipeline_variant_pk_id': 8,
 'pipeline_variant_version': '55f45c16-591e-4a16-8082-3ab6d843b484'}

Deploy a Current Pipeline Version

Deploy a an existing pipeline. Note that for any pipeline that has model steps, they must be included either in model_configs, model_ids or models.

  • Endpoint
    • /pipelines/deploy
  • Parameters
    • deploy_id (REQUIRED string): The name for the pipeline deployment.
    • engine_config (OPTIONAL string): Additional configuration options for the pipeline. These set the memory, replicas, and other settings. For example: {"cpus": 1, "replica_count": 1, "memory": "999Mi"} Available parameters include the following.
      • cpus: The number of CPUs to apply to the native runtime models in the pipeline. cpus can be a fraction of a cpu, for example "cpus": 0.25.
      • gpus: The number of GPUs to apply to the native runtime models. GPUs can only be allocated in whole numbers. Organizations should monitor how many GPUs are allocated to a pipelines to verify they have enough GPUs for all pipelines. If gpus is called, then the deployment_label must be called and match the GPU Nodepool for the Wallaroo Cluster hosting the Wallaroo instance.
      • replica_count: The number of replicas of the pipeline to deploy. This allows for multiple deployments of the same models to be deployed to increase inferences through parallelization.
      • replica_autoscale_min_max: Provides replicas to be scaled from 0 to some maximum number of replicas. This allows pipelines to spin up additional replicas as more resources are required, then spin them back down to save on resources and costs. For example: "replica_autoscale_min_max": {"maximum": 2, "minimum":0}
      • autoscale_cpu_utilization: Sets the average CPU percentage metric for when to load or unload another replica.
      • disable_autoscale: Disables autoscaling in the deployment configuration.
      • memory: Sets the amount of RAM to allocate the pipeline. The memory_spec string is in the format “{size as number}{unit value}”. The accepted unit values are:
        • KiB (for KiloBytes)
        • MiB (for MegaBytes)
        • GiB (for GigaBytes)
        • TiB (for TeraBytes)
      • lb_cpus: Sets the number or fraction of CPUs to use for the pipeline’s load balancer, for example: 0.25, 1, 1.5, etc. The units, similar to the Kubernetes CPU definitions.
      • lb_memory: Sets the amount of RAM to allocate the pipeline’s load balancer. The memory_spec string is in the format “{size as number}{unit value}”. The accepted unit values are:
        • KiB (for KiloBytes)
        • MiB (for MegaBytes)
        • GiB (for GigaBytes)
        • TiB (for TeraBytes)
      • deployment_label: Label used for Kubernetes labels.
      • arch: Determines which architecture to use. The available options are:
      • sidekick_cpus: Sets the number of CPUs to be used for the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. The parameters are as follows:
        • model: The sidekick model to configure.
        • core_count: Sets the number or fraction of CPUs to use.
      • sidekick_arch: Determines which architecture to use. The available options are:
      • sidekick_memory: Sets the memory available to for the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. The parameters are as follows:
        • model: The sidekick model to configure.
        • memory_spec: The amount of memory to allocated as memory unit values. The accepted unit values are:
          • KiB (for KiloBytes)
          • MiB (for MegaBytes)
          • GiB (for GigaBytes)
          • TiB (for TeraBytes)
      • sidekick_env: Environment variables submitted to the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. These are used specifically for containerized models that have environment variables that effect their performance. This takes the following parameters:
        • model: The sidekick model to configure.
        • environment: Dictionary inputs
      • sidekick_gpus: Sets the number of GPUs to allocate for containerized runtimes. GPUs are only allocated in whole units, not as fractions. Organizations should be aware of the total number of GPUs available to the cluster, and monitor which pipeline deployment configurations have GPUs allocated to ensure they do not run out. If there are not enough GPUs to allocate to a pipeline deployment configuration, and error message will be deployed when the pipeline is deployed. If gpus is called, then the deployment_label must be called and match the GPU Nodepool for the Wallaroo Cluster hosting the Wallaroo instance. This takes the following parameters:
        • model: The sidekick model to configure.
        • core_count: The number of GPUs to allocate.
    • pipeline_version_pk_id (REQUIRED int): Pipeline version id.
    • model_configs (OPTIONAL Array int): Ids of model configs to apply.
    • model_ids (OPTIONAL Array int): Ids of models to apply to the pipeline. If passed in, model_configs will be created automatically.
    • models (OPTIONAL Array models): If the model ids are not available as a pipeline step, the models’ data can be passed to it through this method. The options below are only required if models are provided as a parameter.
      • name (REQUIRED string): Name of the uploaded model that is in the same workspace as the pipeline.
      • version (REQUIRED string): Version of the model to use.
      • sha (REQUIRED string): SHA value of the model.
    • pipeline_id (REQUIRED int): Numerical value of the pipeline to deploy.
  • Returns
    • id (int): The deployment id.

 

Examples: Both the pipeline with model created in the step Create Pipeline in a Workspace will be deployed and their deployment information saved for later examples.

# Deploy a pipeline with models

# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/pipelines/deploy"

model_deploy_id=model_pipeline_name

# example_model_deploy_id="test deployment name"

data = {
    "deploy_id": model_deploy_id,
    "pipeline_version_pk_id": model_pipeline_variant_id,
    "models": [
        {
            "name": model_name,
            "version":example_model_version,
            "sha":example_model_sha
        }
    ],
    "pipeline_id": model_pipeline_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
display(response)
model_deployment_id=response['id']

# wait 45 seconds for the pipeline to complete deployment
import time
time.sleep(45)
{'id': 5}

Get Deployment Status

Returns the deployment status.

  • Parameters
    • name - (REQUIRED string): The deployment in the format {deployment_name}-{deploymnent-id}.

Example: The deployed empty and model pipelines status will be displayed.

# Retrieve the token 
headers = wl.auth.auth_header()

# Get model pipeline deployment

api_request = f"{APIURL}/v1/api/status/get_deployment"

data = {
  "name": f"{model_deploy_id}-{model_deployment_id}"
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.3.136',
   'name': 'engine-76b8f76d58-vbwqs',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'pipelinemodels',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'apimodel',
      'version': '0b989008-5f1d-453e-8085-98d97be1b722',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.166',
   'name': 'engine-lb-584f54c899-rdfkl',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

Get External Inference URL

The API command /admin/get_pipeline_external_url retrieves the external inference URL for a specific pipeline in a workspace.

  • Parameters
    • workspace_id (REQUIRED integer): The workspace integer id.
    • pipeline_name (REQUIRED string): The name of the deployment.

In this example, a list of the workspaces will be retrieved. Based on the setup from the Internal Pipeline Deployment URL Tutorial, the workspace matching urlworkspace will have it’s workspace id stored and used for the /admin/get_pipeline_external_url request with the pipeline urlpipeline.

The External Inference URL will be stored as a variable for the next step.

## Retrieve the pipeline's External Inference URL

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/admin/get_pipeline_external_url"

data = {
    "workspace_id": example_workspace_id,
    "pipeline_name": model_pipeline_name
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
print(response)
deployurl = response['url']
{'url': 'https://doc-test.api.wallarooexample.ai/v1/api/pipelines/infer/pipelinemodels-5/pipelinemodels'}

Perform Inference Through External URL

The inference can now be performed through the External Inference URL. This URL will accept the same inference data file that is used with the Wallaroo SDK, or with an Internal Inference URL as used in the Internal Pipeline Inference URL Tutorial.

Deployed pipelines have their own Inference URL that accepts HTTP POST submissions.

For connections that are external to the Kubernetes cluster hosting the Wallaroo instance, model endpoints must be enabled.

HTTP Headers

The following headers are required for connecting the the Pipeline Deployment URL:

  • Authorization: This requires the JWT token in the format 'Bearer ' + token. For example:

    Authorization: Bearer abcdefg==
    
  • Content-Type:

  • For DataFrame formatted JSON:

    Content-Type:application/json; format=pandas-records
    
  • For Arrow binary files, the Content-Type is application/vnd.apache.arrow.file.

    Content-Type:application/vnd.apache.arrow.file
    
  • IMPORTANT NOTE: Verify that the pipeline deployed has status Running before attempting an inference.

# Retrieve the token
headers = wl.auth.auth_header()

# set Content-Type type
headers['Content-Type']='application/json; format=pandas-records'

## Inference through external URL using dataframe

# retrieve the json data to submit
data = [
    {
        "tensor":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
]

# submit the request via POST, import as pandas DataFrame
response = pd.DataFrame.from_records(
    requests.post(
        deployurl, 
        json=data, 
        headers=headers)
        .json()
    )

display(response)
timeinoutcheck_failuresmetadata
01684356836285{'tensor': [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756]}{'dense_1': [0.0014974177]}[]{'last_model': '{"model_name":"apimodel","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"}', 'pipeline_version': '', 'elapsed': [163502, 309804]}

Undeploy a Pipeline

Undeploys a deployed pipeline.

  • Parameters
    • pipeline_id - (REQUIRED int): The numerical id of the pipeline.
    • deployment_id - (REQUIRED int): The numerical id of the deployment.
  • Returns
    • Nothing if the call is successful.

Example: Both the empty pipeline and pipeline with models deployed in the step Deploy a Pipeline will be undeployed.

# Undeploy pipeline with models
# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/pipelines/undeploy"

data = {
    "pipeline_id": model_pipeline_id,
    "deployment_id":model_deployment_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response

Copy a Pipeline

Copies an existing pipeline into a new one in the same workspace. A new engine configuration can be set for the copied pipeline.

  • Parameters
    • name - (REQUIRED string): The name of the new pipeline.
    • workspace_id - (REQUIRED int): The numerical id of the workspace to copy the source pipeline from.
    • source_pipeline - (REQUIRED int): The numerical id of the pipeline to copy from.
    • deploy - (OPTIONAL string): Name of the deployment.
    • engine_config - (OPTIONAL string): Engine configuration options.
    • pipeline_version - (OPTIONAL string): Optional version of the copied pipeline to create.

Example: The pipeline with models created in the step Create Pipeline in a Workspace will be copied into a new one.

## Copy a pipeline

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/pipelines/copy"

data = {
  "name": example_copied_pipeline_name,
  "workspace_id": example_workspace_id,
  "source_pipeline": model_pipeline_id
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'pipeline_pk_id': 9,
 'pipeline_variant_pk_id': 9,
 'pipeline_version': None,
 'deployment': None}

7 - Wallaroo MLOps API Essentials Guide: Pipeline Log Management

How to use the Wallaroo API for Pipeline Log Management

Pipeline logs are retrieved through the Wallaroo MLOps API with the following request.

  • REQUEST URL
    • v1/api/pipelines/get_logs
  • Headers
    • Accept:
      • application/json; format=pandas-records: For the logs returned as pandas DataFrame
      • application/vnd.apache.arrow.file: for the logs returned as Apache Arrow
  • PARAMETERS
    • pipeline_id (String Required): The name of the pipeline.
    • workspace_id (Integer Required): The numerical identifier of the workspace.
    • cursor (String Optional): Cursor returned with a previous page of results from a pipeline log request, used to retrieve the next page of information.
    • order (String Optional Default: Desc): The order for log inserts returned. Valid values are:
      • Asc: In chronological order of inserts.
      • Desc: In reverse chronological order of inserts.
    • page_size (Integer Optional Default: 1000.): Max records per page.
    • start_time (String Optional): The start time of the period to retrieve logs for in RFC 3339 format for DateTime. Must be combined with end_time.
    • end_time (String Optional): The end time of the period to retrieve logs for in RFC 3339 format for DateTime. Must be combined with start_time.
  • RETURNS
    • The logs are returned by default as 'application/json; format=pandas-records' format. To request the logs as Apache Arrow tables, set the submission header Accept to application/vnd.apache.arrow.file.
    • Headers:
      • x-iteration-cursor: Used to retrieve the next page of results. This is not included if x-iteration-status is All.
      • x-iteration-status: Informs whether there are more records available outside of this log request parameters.
        • All: This page includes all logs available from this request. If x-iteration-status is All, then x-iteration-cursor is not provided.
        • SchemaChange: A change in the log schema caused by actions such as pipeline version, etc.
        • RecordLimited: The number of records exceeded from the page size, more records can be requested as the next page. There may be more records available to retrieve OR the record limit was reached for this request even if no more records are available in next cursor request.
        • ByteLimited: The number of records exceeded the pipeline log limit which is around 100K.

Standard Pipeline Logs Example

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/pipelines/get_logs"

# Standard log retrieval

data = {
    'pipeline_id': main_pipeline_name,
    'workspace_id': workspace_id
}

response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())

display(standard_logs.head(5).loc[:, ["time", "in", "out"]])
timeinout
01684423875900{'tensor': [4.0, 2.5, 2900.0, 5505.0, 2.0, 0.0, 0.0, 3.0, 8.0, 2900.0, 0.0, 47.6063, -122.02, 2970.0, 5251.0, 12.0, 0.0, 0.0]}{'variable': [718013.75]}
11684423875900{'tensor': [2.0, 2.5, 2170.0, 6361.0, 1.0, 0.0, 2.0, 3.0, 8.0, 2170.0, 0.0, 47.7109, -122.017, 2310.0, 7419.0, 6.0, 0.0, 0.0]}{'variable': [615094.56]}
21684423875900{'tensor': [3.0, 2.5, 1300.0, 812.0, 2.0, 0.0, 0.0, 3.0, 8.0, 880.0, 420.0, 47.5893, -122.317, 1300.0, 824.0, 6.0, 0.0, 0.0]}{'variable': [448627.72]}
31684423875900{'tensor': [4.0, 2.5, 2500.0, 8540.0, 2.0, 0.0, 0.0, 3.0, 9.0, 2500.0, 0.0, 47.5759, -121.994, 2560.0, 8475.0, 24.0, 0.0, 0.0]}{'variable': [758714.2]}
41684423875900{'tensor': [3.0, 1.75, 2200.0, 11520.0, 1.0, 0.0, 0.0, 4.0, 7.0, 2200.0, 0.0, 47.7659, -122.341, 1690.0, 8038.0, 62.0, 0.0, 0.0]}{'variable': [513264.7]}

Shadow Deploy Pipeline Logs Example

# Retrieve logs from specific date/time to only get the two DataFrame input inferences in ascending format

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/pipelines/get_logs"

# Standard log retrieval

data = {
    'pipeline_id': main_pipeline_name,
    'workspace_id': workspace_id,
    'order': 'Asc',
    'start_time': f'{shadow_date_start.isoformat()}',
    'end_time': f'{shadow_date_end.isoformat()}'
}

response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())

display(standard_logs.head(5).loc[:, ["time", "out", "out_logcontrolchallenger01", "out_logcontrolchallenger02"]])
timeoutout_logcontrolchallenger01out_logcontrolchallenger02
01684427140394{'variable': [718013.75]}{'variable': [659806.0]}{'variable': [704901.9]}
11684427140394{'variable': [615094.56]}{'variable': [732883.5]}{'variable': [695994.44]}
21684427140394{'variable': [448627.72]}{'variable': [419508.84]}{'variable': [416164.8]}
31684427140394{'variable': [758714.2]}{'variable': [634028.8]}{'variable': [655277.2]}
41684427140394{'variable': [513264.7]}{'variable': [427209.44]}{'variable': [426854.66]}

A/B Testing Pipeline Logs Example

# Retrieve logs from specific date/time to only get the two DataFrame input inferences in ascending format

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/pipelines/get_logs"

# Standard log retrieval

data = {
    'pipeline_id': main_pipeline_name,
    'workspace_id': workspace_id,
    'order': 'Asc',
    'start_time': f'{ab_date_start.isoformat()}',
    'end_time': f'{ab_date_end.isoformat()}'
}

response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())

display(standard_logs.head(5).loc[:, ["time", "out"]])
timeout
01684427501820{'_model_split': ['{"name":"logcontrolchallenger02","version":"89dba25e-a11e-453d-9bcc-cddf8d6ddea0","sha":"ed6065a79d841f7e96307bb20d5ef22840f15da0b587efb51425c7ad60589d6a"}'], 'variable': [715947.75]}
11684427502196{'_model_split': ['{"name":"logcontrolchallenger01","version":"07fe7686-9bd0-4fd3-9a7c-0e933a74003c","sha":"31e92d6ccb27b041a324a7ac22cf95d9d6cc3aa7e8263a229f7c4aec4938657c"}'], 'variable': [341386.34]}
21684427503778{'_model_split': ['{"name":"logapicontrol","version":"70b76ecb-55c2-4d68-be9b-b440b11e6499","sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}'], 'variable': [1039781.2]}
31684427504566{'_model_split': ['{"name":"logcontrolchallenger02","version":"89dba25e-a11e-453d-9bcc-cddf8d6ddea0","sha":"ed6065a79d841f7e96307bb20d5ef22840f15da0b587efb51425c7ad60589d6a"}'], 'variable': [411090.75]}
41684427505094{'_model_split': ['{"name":"logcontrolchallenger02","version":"89dba25e-a11e-453d-9bcc-cddf8d6ddea0","sha":"ed6065a79d841f7e96307bb20d5ef22840f15da0b587efb51425c7ad60589d6a"}'], 'variable': [296175.66]}

Pipeline Log Storage

Pipeline logs have a set allocation of storage space and data requirements.

Pipeline Log Storage Warnings

To prevent storage and performance issues, inference result data may be dropped from pipeline logs by the following standards:

  • Columns are progressively removed from the row starting with the largest input data size and working to the smallest, then the same for outputs.

For example, Computer Vision ML Models typically have large inputs and output values - a single pandas DataFrame inference request may be over 13 MB in size, and the inference results nearly as large. To prevent pipeline log storage issues, the input may be dropped from the pipeline logs, and if additional space is needed, the inference outputs would follow. The time column is preserved.

If a pipeline has dropped columns for space purposes, this will be displayed when a log request is made with the following warning, with {columns} replaced with the dropped columns.

The inference log is above the allowable limit and the following columns may have been suppressed for various rows in the logs: {columns}. To review the dropped columns for an individual inferences suppressed data, include dataset=["metadata"] in the log request.

Review Dropped Columns

To review what columns are dropped from pipeline logs for storage reasons, include the dataset metadata in the request to view the column metadata.dropped. This metadata field displays a List of any columns dropped from the pipeline logs.

For example:

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/pipelines/get_logs"

# Standard log retrieval

data = {
    'pipeline_id': main_pipeline_name,
    'workspace_id': workspace_id
}

response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())

display(len(standard_logs))
display(standard_logs.head(5).loc[:, ["time", "metadata"]])
cursor = response.headers['x-iteration-cursor']
 timemetadata
01688760035752{’last_model’: ‘{“model_name”:“logapicontrol”,“model_sha”:“e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6”}’, ‘pipeline_version’: ‘’, ’elapsed’: [112967, 267146], ‘dropped’: []}
11688760036054{’last_model’: ‘{“model_name”:“logapicontrol”,“model_sha”:“e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6”}’, ‘pipeline_version’: ‘’, ’elapsed’: [37127, 594183], ‘dropped’: []}
21688759857040{’last_model’: ‘{“model_name”:“logapicontrol”,“model_sha”:“e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6”}’, ‘pipeline_version’: ‘’, ’elapsed’: [111082, 253184], ‘dropped’: []}
31688759857526{’last_model’: ‘{“model_name”:“logapicontrol”,“model_sha”:“e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6”}’, ‘pipeline_version’: ‘’, ’elapsed’: [43962, 265740], ‘dropped’: []}

Suppressed Data Elements

Data elements that do not fit the supported data types below, such as None or Null values, are not supported in pipeline logs. When present, undefined data will be written in the place of the null value, typically zeroes. Any null list values will present an empty list.

8 - Wallaroo MLOps API Essentials Guide: Enablement Management

How to use the Wallaroo API for Enablement Management

Enablement Management

Enablement Management allows users to see what Wallaroo features have been activated.

List Enablement Features

Lists the enablement features for the Wallaroo instance.

  • PARAMETERS
    • null: An empty set {}
  • RETURNS
    • features - (string): Enabled features.
    • name - (string): Name of the Wallaroo instance.
    • is_auth_enabled - (bool): Whether authentication is enabled.
# List enablement features
# Retrieve the token 
headers = wl.auth.auth_header()
api_request = f"{APIURL}/v1/api/features/list"

data = {
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'features': {'plateau': 'true'},
 'name': 'Wallaroo Dev',
 'is_auth_enabled': True}

9 - Wallaroo MLOps API Essentials Guide: Assays Management

How to use the Wallaroo API for Assays Management

Assays

IMPORTANT NOTE: These assays were run in a Wallaroo environment with canned historical data. See the Wallaroo Assay Tutorial for details on setting up this environment. This historical data is required for these examples.

Create Assay

Create a new array in a specified pipeline.

  • PARAMETERS
    • id - (OPTIONAL int): The numerical identifier for the assay.
    • name - (REQUIRED string): The name of the assay.
    • pipeline_id - (REQUIRED int): The numerical idenfifier the assay will be placed into.
    • pipeline_name - (REQUIRED string): The name of the pipeline
    • active - (REQUIRED bool): Indicates whether the assay will be active upon creation or not.
    • status - (REQUIRED string): The status of the assay upon creation.
    • iopath - (REQUIRED string): The iopath of the assay in the format "input|output field_name field_index.
    • baseline - (REQUIRED baseline): The baseline for the assay.
      • Fixed - (REQUIRED AssayFixConfiguration): The fixed configuration for the assay.
        • pipeline - (REQUIRED string): The name of the pipeline with the baseline data.
        • model - (REQUIRED string): The name of the model used.
        • start_at - (REQUIRED string): The DateTime of the baseline start date.
        • end_at - (REQUIRED string): The DateTime of the baseline end date.
    • window (REQUIRED AssayWindow): Assay window.
      • pipeline - (REQUIRED string): The name of the pipeline for the assay window.
      • model - (REQUIRED string): The name of the model used for the assay window.
      • width - (REQUIRED string): The width of the assay window.
      • start - (OPTIONAL string): The DateTime of when to start the assay window.
      • interval - (OPTIONAL string): The assay window interval.
    • summarizer - (REQUIRED AssaySummerizer): The summarizer type for the array aka “advanced settings” in the Wallaroo Dashboard UI.
      • type - (REQUIRED string): Type of summarizer.
      • bin_mode - (REQUIRED string): The binning model type. Values can be:
        • Quantile
        • Equal
      • aggregation - (REQUIRED string): Aggregation type.
      • metric - (REQUIRED string): Metric type. Values can be:
        • PSI
        • Maximum Difference of Bins
        • Sum of the Difference of Bins
      • num_bins - (REQUIRED int): The number of bins. Recommanded values are between 5 and 14.
      • bin_weights - (OPTIONAL AssayBinWeight): The weights assigned to the assay bins.
      • bin_width - (OPTIONAL AssayBinWidth): The width assigned to the assay bins.
      • provided_edges - (OPTIONAL AssayProvidedEdges): The edges used for the assay bins.
      • add_outlier_edges - (REQUIRED bool): Indicates whether to add outlier edges or not.
    • warning_threshold - (OPTIONAL number): Optional warning threshold.
    • alert_threshold - (REQUIRED number): Alert threshold.
    • run_until - (OPTIONAL string): DateTime of when to end the assay.
    • workspace_id - (REQUIRED integer): The workspace the assay is part of.
    • model_insights_url - (OPTIONAL string): URL for model insights.
  • RETURNS
    • assay_id - (integer): The id of the new assay.

As noted this example requires the Wallaroo Assay Tutorial for historical data. Before running this example, set the sample pipeline id, pipeline, name, model name, and workspace id in the code sample below.

For our example, we will be using the output of the field dense_2 at the index 0 for the iopath.

For more information on retrieving this information, see the Wallaroo Developer Guides.

# Retrieve information for the housepricedrift workspace

# List workspaces

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/workspaces/list"

data = {
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
assay_workspace = next(workspace for workspace in response["workspaces"] if workspace["name"] == "housepricedrift")
assay_workspace_id = assay_workspace['id']
assay_pipeline_id = assay_workspace['pipelines'][0]
## Create assay

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/create"

exampleAssayName = "api assay"

## Now get all of the assays for the pipeline in workspace 4 `housepricedrift`

exampleAssayPipelineId = assay_pipeline_id
exampleAssayPipelineName = "housepricepipe"
exampleAssayModelName = "housepricemodel"
exampleAssayWorkspaceId = assay_workspace_id

# iopath can be input 00 or output 0 0
data = {
    'name': exampleAssayName,
    'pipeline_id': exampleAssayPipelineId,
    'pipeline_name': exampleAssayPipelineName,
    'active': True,
    'status': 'active',
    'iopath': "output dense_2 0",
    'baseline': {
        'Fixed': {
            'pipeline': exampleAssayPipelineName,
            'model': exampleAssayModelName,
            'start_at': '2023-01-01T00:00:00-05:00',
            'end_at': '2023-01-02T00:00:00-05:00'
        }
    },
    'window': {
        'pipeline': exampleAssayPipelineName,
        'model': exampleAssayModelName,
        'width': '24 hours',
        'start': None,
        'interval': None
    },
    'summarizer': {
        'type': 'UnivariateContinuous',
        'bin_mode': 'Quantile',
        'aggregation': 'Density',
        'metric': 'PSI',
        'num_bins': 5,
        'bin_weights': None,
        'bin_width': None,
        'provided_edges': None,
        'add_outlier_edges': True
    },
    'warning_threshold': 0,
    'alert_threshold': 0.1,
    'run_until': None,
    'workspace_id': exampleAssayWorkspaceId
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
example_assay_id = response['assay_id']
response
{'assay_id': 5}

List Assays

Lists all assays in the specified pipeline.

  • PARAMETERS
    • pipeline_id - (REQUIRED int): The numerical ID of the pipeline.
  • RETURNS
    • assays - (Array assays): A list of all assays.

Example: Display a list of all assays in a workspace. This will assume we have a workspace with an existing Assay and the associated data has been upload. See the tutorial Wallaroo Assays Tutorial.

For this reason, these values are hard coded for now.

# Get assays
# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/list"

data = {
    "pipeline_id": exampleAssayPipelineId
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
[{'id': 5,
  'name': 'api assay',
  'active': True,
  'status': 'created',
  'warning_threshold': 0.0,
  'alert_threshold': 0.1,
  'pipeline_id': 1,
  'pipeline_name': 'housepricepipe',
  'last_run': None,
  'next_run': '2023-05-19T17:26:51.743327+00:00',
  'run_until': None,
  'updated_at': '2023-05-19T17:26:51.745495+00:00',
  'baseline': {'Fixed': {'pipeline': 'housepricepipe',
    'model': 'housepricemodel',
    'start_at': '2023-01-01T00:00:00-05:00',
    'end_at': '2023-01-02T00:00:00-05:00'}},
  'window': {'pipeline': 'housepricepipe',
   'model': 'housepricemodel',
   'width': '24 hours',
   'start': None,
   'interval': None},
  'summarizer': {'type': 'UnivariateContinuous',
   'bin_mode': 'Quantile',
   'aggregation': 'Density',
   'metric': 'PSI',
   'num_bins': 5,
   'bin_weights': None,
   'provided_edges': None}},
 {'id': 2,
  'name': 'onmyexample assay',
  'active': True,
  'status': 'created',
  'warning_threshold': None,
  'alert_threshold': 0.5,
  'pipeline_id': 1,
  'pipeline_name': 'housepricepipe',
  'last_run': None,
  'next_run': '2023-05-17T21:46:58.633746+00:00',
  'run_until': None,
  'updated_at': '2023-05-17T21:46:58.636786+00:00',
  'baseline': {'Fixed': {'pipeline': 'housepricepipe',
    'model': 'housepricemodel',
    'start_at': '2023-01-01T00:00:00+00:00',
    'end_at': '2023-01-02T00:00:00+00:00'}},
  'window': {'pipeline': 'housepricepipe',
   'model': 'housepricemodel',
   'width': '24 hours',
   'start': None,
   'interval': None},
  'summarizer': {'type': 'UnivariateContinuous',
   'bin_mode': 'Quantile',
   'aggregation': 'Density',
   'metric': 'PSI',
   'num_bins': 5,
   'bin_weights': None,
   'provided_edges': None}},
 {'id': 1,
  'name': 'api_assay',
  'active': True,
  'status': 'created',
  'warning_threshold': 0.0,
  'alert_threshold': 0.1,
  'pipeline_id': 1,
  'pipeline_name': 'housepricepipe',
  'last_run': None,
  'next_run': '2023-05-17T20:54:09.27658+00:00',
  'run_until': None,
  'updated_at': '2023-05-17T20:54:09.27932+00:00',
  'baseline': {'Fixed': {'pipeline': 'housepricepipe',
    'model': 'housepricemodel',
    'start_at': '2023-01-01T00:00:00-05:00',
    'end_at': '2023-01-02T00:00:00-05:00'}},
  'window': {'pipeline': 'housepricepipe',
   'model': 'housepricemodel',
   'width': '24 hours',
   'start': None,
   'interval': None},
  'summarizer': {'type': 'UnivariateContinuous',
   'bin_mode': 'Quantile',
   'aggregation': 'Density',
   'metric': 'PSI',
   'num_bins': 5,
   'bin_weights': None,
   'provided_edges': None}}]

Activate or Deactivate Assay

Activates or deactivates an existing assay.

  • Parameters
    • id - (REQUIRED int): The numerical id of the assay.
    • active - (REQUIRED bool): True to activate the assay, False to deactivate it.
  • Returns
      • id - (integer): The numerical id of the assay.
    • active - (bool): True to activate the assay, False to deactivate it.

Example: Assay 8 “House Output Assay” will be deactivated then activated.

# Deactivate assay

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/set_active"

data = {
    'id': example_assay_id,
    'active': False
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'id': 5, 'active': False}
# Activate assay

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/set_active"

data = {
    'id': example_assay_id,
    'active': True
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'id': 5, 'active': True}

Create Interactive Baseline

Creates an interactive assay baseline.

  • PARAMETERS
    • id - (REQUIRED int): The numerical identifier for the assay.
    • name - (REQUIRED string): The name of the assay.
    • pipeline_id - (REQUIRED int): The numerical idenfifier the assay will be placed into.
    • pipeline_name - (REQUIRED string): The name of the pipeline
    • active - (REQUIRED bool): Indicates whether the assay will be active upon creation or not.
    • status - (REQUIRED string): The status of the assay upon creation.
    • iopath - (REQUIRED string): The iopath of the assay.
    • baseline - (REQUIRED baseline): The baseline for the assay.
      • Fixed - (REQUIRED AssayFixConfiguration): The fixed configuration for the assay.
        • pipeline - (REQUIRED string): The name of the pipeline with the baseline data.
        • model - (REQUIRED string): The name of the model used.
        • start_at - (REQUIRED string): The DateTime of the baseline start date.
        • end_at - (REQUIRED string): The DateTime of the baseline end date.
    • window (REQUIRED AssayWindow): Assay window.
      • pipeline - (REQUIRED string): The name of the pipeline for the assay window.
      • model - (REQUIRED string): The name of the model used for the assay window.
      • width - (REQUIRED string): The width of the assay window.
      • start - (OPTIONAL string): The DateTime of when to start the assay window.
      • interval - (OPTIONAL string): The assay window interval.
    • summarizer - (REQUIRED AssaySummerizer): The summarizer type for the array aka “advanced settings” in the Wallaroo Dashboard UI.
      • type - (REQUIRED string): Type of summarizer.
      • bin_mode - (REQUIRED string): The binning model type. Values can be:
        • Quantile
        • Equal
      • aggregation - (REQUIRED string): Aggregation type.
      • metric - (REQUIRED string): Metric type. Values can be:
        • PSI
        • Maximum Difference of Bins
        • Sum of the Difference of Bins
      • num_bins - (REQUIRED int): The number of bins. Recommanded values are between 5 and 14.
      • bin_weights - (OPTIONAL AssayBinWeight): The weights assigned to the assay bins.
      • bin_width - (OPTIONAL AssayBinWidth): The width assigned to the assay bins.
      • provided_edges - (OPTIONAL AssayProvidedEdges): The edges used for the assay bins.
      • add_outlier_edges - (REQUIRED bool): Indicates whether to add outlier edges or not.
    • warning_threshold - (OPTIONAL number): Optional warning threshold.
    • alert_threshold - (REQUIRED number): Alert threshold.
    • run_until - (OPTIONAL string): DateTime of when to end the assay.
    • workspace_id - (REQUIRED integer): The workspace the assay is part of.
    • model_insights_url - (OPTIONAL string): URL for model insights.
  • RETURNS

Example: An interactive assay baseline will be set for the assay “Test Assay” on Pipeline 4.

## Run interactive baseline

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/run_interactive_baseline"

data = {
    'id': example_assay_id,
    'name': exampleAssayName,
    'pipeline_id': exampleAssayPipelineId,
    'pipeline_name': exampleAssayPipelineName,
    'active': True,
    'status': 'active',
    'iopath': "output dense_2 0",
    'baseline': {
        'Fixed': {
            'pipeline': exampleAssayPipelineName,
            'model': exampleAssayModelName,
            'start_at': '2023-01-01T00:00:00-05:00',
            'end_at': '2023-01-02T00:00:00-05:00'
        }
    },
    'window': {
        'pipeline': exampleAssayPipelineName,
        'model': exampleAssayModelName,
        'width': '24 hours',
        'start': None,
        'interval': None
    },
    'summarizer': {
        'type': 'UnivariateContinuous',
        'bin_mode': 'Quantile',
        'aggregation': 'Density',
        'metric': 'PSI',
        'num_bins': 5,
        'bin_weights': None,
        'bin_width': None,
        'provided_edges': None,
        'add_outlier_edges': True
    },
    'warning_threshold': 0,
    'alert_threshold': 0.1,
    'run_until': None,
    'workspace_id': exampleAssayWorkspaceId
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response
{'id': None,
 'assay_id': 5,
 'window_start': '2023-01-01T05:00:00Z',
 'analyzed_at': '2023-05-19T17:26:59.664583293Z',
 'elapsed_millis': 0,
 'iopath': 'output dense_2 0',
 'pipeline_id': None,
 'baseline_summary': {'count': 181,
  'min': 12.002464294433594,
  'max': 14.095687866210938,
  'mean': 12.892810610776449,
  'median': 12.862584114074709,
  'std': 0.4259400394014014,
  'edges': [12.002464294433594,
   12.525982856750488,
   12.772802352905272,
   12.960931777954102,
   13.246906280517578,
   14.095687866210938,
   None],
  'edge_names': ['left_outlier',
   'q_20',
   'q_40',
   'q_60',
   'q_80',
   'q_100',
   'right_outlier'],
  'aggregated_values': [0.0,
   0.19889502762430936,
   0.19889502762430936,
   0.20441988950276244,
   0.19889502762430936,
   0.19889502762430936,
   0.0],
  'aggregation': 'Density',
  'start': '2023-01-01T05:00:00Z',
  'end': '2023-01-02T05:00:00Z'},
 'window_summary': {'count': 181,
  'min': 12.002464294433594,
  'max': 14.095687866210938,
  'mean': 12.892810610776449,
  'median': 12.862584114074709,
  'std': 0.4259400394014014,
  'edges': [12.002464294433594,
   12.525982856750488,
   12.772802352905272,
   12.960931777954102,
   13.246906280517578,
   14.095687866210938,
   None],
  'edge_names': ['left_outlier',
   'e_1.25e1',
   'e_1.28e1',
   'e_1.30e1',
   'e_1.32e1',
   'e_1.41e1',
   'right_outlier'],
  'aggregated_values': [0.0,
   0.19889502762430936,
   0.19889502762430936,
   0.20441988950276244,
   0.19889502762430936,
   0.19889502762430936,
   0.0],
  'aggregation': 'Density',
  'start': '2023-01-01T05:00:00Z',
  'end': '2023-01-02T05:00:00Z'},
 'warning_threshold': 0.0,
 'alert_threshold': 0.1,
 'score': 0.0,
 'scores': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 'bin_index': None,
 'summarizer': {'type': 'UnivariateContinuous',
  'bin_mode': 'Quantile',
  'aggregation': 'Density',
  'metric': 'PSI',
  'num_bins': 5,
  'bin_weights': None,
  'provided_edges': None},
 'status': 'BaselineRun',
 'created_at': None}

Get Assay Baseline

Retrieve an assay baseline.

  • Parameters
    • workspace_id - (REQUIRED integer): Numerical id for the workspace the assay is in.
    • pipeline_name - (REQUIRED string): Name of the pipeline the assay is in.
    • start - (OPTIONAL string): DateTime for when the baseline starts.
    • end - (OPTIONAL string): DateTime for when the baseline ends.
    • model_name - (OPTIONAL string): Name of the model.
    • limit - (OPTIONAL integer): Maximum number of baselines to return.
  • Returns
    • Assay Baseline

Example: 3 assay baselines for Workspace 6 and pipeline houseprice-pipe will be retrieved.

## Get Assay Baseline

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/get_baseline"

data = {
    'workspace_id': exampleAssayWorkspaceId,
    'pipeline_name': exampleAssayPipelineName,
    'limit': 3
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response[0:2]
[{'time': 1672531200000,
  'in.tensor': [0.6752651953165153,
   -0.4732014898144956,
   -1.0785881334179752,
   0.25006446993148707,
   -0.08666382440547035,
   0.012211745933432551,
   -0.1904726364343265,
   -0.9179715198382244,
   -0.305653139057544,
   0.905425782569012,
   -0.5584151415472702,
   -0.8905121321380776,
   1.7014907488187343,
   -0.03617359856638,
   -0.20817781526102327,
   -0.4017891748132812,
   -0.19176790501742016],
  'out.dense_2': [12.529610633850098],
  'metadata.last_model': '{"model_name": "housepricemodel", "model_sha": "test_version"}',
  'metadata.profile': '{"elapsed_ns": 243}'},
 {'time': 1672531676753,
  'in.tensor': [-0.39764636440424433,
   -0.4732014898144956,
   0.5769261528142077,
   0.07215545493232875,
   -0.08666382440547035,
   0.5668723158705202,
   0.0035716408873876734,
   -0.9179715198382244,
   -0.305653139057544,
   0.905425782569012,
   0.29288456205300767,
   -0.10763168763453018,
   1.3841294506067472,
   -0.13822039562434324,
   -0.20817781526102327,
   1.392623186456163,
   0.0831911918963078],
  'out.dense_2': [13.355737686157228],
  'metadata.last_model': '{"model_name": "housepricemodel", "model_sha": "test_version"}',
  'metadata.profile': '{"elapsed_ns": 216}'}]

Run Assay Interactively

Runs an assay.

  • Parameters
    • id - (REQUIRED int): The numerical identifier for the assay.
    • name - (REQUIRED string): The name of the assay.
    • pipeline_id - (REQUIRED int): The numerical idenfifier the assay will be placed into.
    • pipeline_name - (REQUIRED string): The name of the pipeline
    • active - (REQUIRED bool): Indicates whether the assay will be active upon creation or not.
    • status - (REQUIRED string): The status of the assay upon creation.
    • iopath - (REQUIRED string): The iopath of the assay.
    • baseline - (REQUIRED baseline): The baseline for the assay.
      • Fixed - (REQUIRED AssayFixConfiguration): The fixed configuration for the assay.
        • pipeline - (REQUIRED string): The name of the pipeline with the baseline data.
        • model - (REQUIRED string): The name of the model used.
        • start_at - (REQUIRED string): The DateTime of the baseline start date.
        • end_at - (REQUIRED string): The DateTime of the baseline end date.
    • window (REQUIRED AssayWindow): Assay window.
      • pipeline - (REQUIRED string): The name of the pipeline for the assay window.
      • model - (REQUIRED string): The name of the model used for the assay window.
      • width - (REQUIRED string): The width of the assay window.
      • start - (OPTIONAL string): The DateTime of when to start the assay window.
      • interval - (OPTIONAL string): The assay window interval.
    • summarizer - (REQUIRED AssaySummerizer): The summarizer type for the array aka “advanced settings” in the Wallaroo Dashboard UI.
      • type - (REQUIRED string): Type of summarizer.
      • bin_mode - (REQUIRED string): The binning model type. Values can be:
        • Quantile
        • Equal
      • aggregation - (REQUIRED string): Aggregation type.
      • metric - (REQUIRED string): Metric type. Values can be:
        • PSI
        • Maximum Difference of Bins
        • Sum of the Difference of Bins
      • num_bins - (REQUIRED int): The number of bins. Recommanded values are between 5 and 14.
      • bin_weights - (OPTIONAL AssayBinWeight): The weights assigned to the assay bins.
      • bin_width - (OPTIONAL AssayBinWidth): The width assigned to the assay bins.
      • provided_edges - (OPTIONAL AssayProvidedEdges): The edges used for the assay bins.
      • add_outlier_edges - (REQUIRED bool): Indicates whether to add outlier edges or not.
    • warning_threshold - (OPTIONAL number): Optional warning threshold.
    • alert_threshold - (REQUIRED number): Alert threshold.
    • run_until - (OPTIONAL string): DateTime of when to end the assay.
    • workspace_id - (REQUIRED integer): The workspace the assay is part of.
    • model_insights_url - (OPTIONAL string): URL for model insights.
  • Returns
    • Assay

Example: An interactive assay will be run for Assay exampleAssayId exampleAssayName. Depending on the number of assay results and the data window, this may take some time. This returns all of the results for this assay at this time. The total number of responses will be displayed after.

## Run interactive assay

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/run_interactive"

data = {
    'id': example_assay_id,
    'name': exampleAssayName,
    'pipeline_id': exampleAssayPipelineId,
    'pipeline_name': exampleAssayPipelineName,
    'active': True,
    'status': 'active',
    'iopath': "output dense_2 0",
    'baseline': {
        'Fixed': {
            'pipeline': exampleAssayPipelineName,
            'model': exampleAssayModelName,
            'start_at': '2023-01-01T00:00:00-05:00',
            'end_at': '2023-01-02T00:00:00-05:00'
        }
    },
    'window': {
        'pipeline': exampleAssayPipelineName,
        'model': exampleAssayModelName,
        'width': '24 hours',
        'start': None,
        'interval': None
    },
    'summarizer': {
        'type': 'UnivariateContinuous',
        'bin_mode': 'Quantile',
        'aggregation': 'Density',
        'metric': 'PSI',
        'num_bins': 5,
        'bin_weights': None,
        'bin_width': None,
        'provided_edges': None,
        'add_outlier_edges': True
    },
    'warning_threshold': 0,
    'alert_threshold': 0.1,
    'run_until': None,
    'workspace_id': exampleAssayWorkspaceId
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response[0]
{'id': None,
 'assay_id': 1,
 'window_start': '2023-01-02T05:00:00Z',
 'analyzed_at': '2023-05-17T20:54:19.568121901Z',
 'elapsed_millis': 578,
 'iopath': 'output dense_2 0',
 'pipeline_id': None,
 'baseline_summary': {'count': 181,
  'min': 12.002464294433594,
  'max': 14.095687866210938,
  'mean': 12.892810610776449,
  'median': 12.862584114074709,
  'std': 0.4259400394014014,
  'edges': [12.002464294433594,
   12.525982856750488,
   12.772802352905272,
   12.960931777954102,
   13.246906280517578,
   14.095687866210938,
   None],
  'edge_names': ['left_outlier',
   'q_20',
   'q_40',
   'q_60',
   'q_80',
   'q_100',
   'right_outlier'],
  'aggregated_values': [0.0,
   0.19889502762430936,
   0.19889502762430936,
   0.20441988950276244,
   0.19889502762430936,
   0.19889502762430936,
   0.0],
  'aggregation': 'Density',
  'start': '2023-01-01T05:00:00Z',
  'end': '2023-01-02T05:00:00Z'},
 'window_summary': {'count': 182,
  'min': 12.037200927734377,
  'max': 14.712774276733398,
  'mean': 12.966292286967184,
  'median': 12.895143508911133,
  'std': 0.4705339357836451,
  'edges': [12.002464294433594,
   12.525982856750488,
   12.772802352905272,
   12.960931777954102,
   13.246906280517578,
   14.095687866210938,
   None],
  'edge_names': ['left_outlier',
   'e_1.25e1',
   'e_1.28e1',
   'e_1.30e1',
   'e_1.32e1',
   'e_1.41e1',
   'right_outlier'],
  'aggregated_values': [0.0,
   0.17032967032967034,
   0.17582417582417584,
   0.23626373626373623,
   0.1978021978021978,
   0.1978021978021978,
   0.02197802197802198],
  'aggregation': 'Density',
  'start': '2023-01-02T05:00:00Z',
  'end': '2023-01-03T05:00:00Z'},
 'warning_threshold': 0.0,
 'alert_threshold': 0.1,
 'score': 0.037033182069201614,
 'scores': [0.0,
  0.0044288126945783235,
  0.00284446741288289,
  0.004610114809454446,
  6.021116179797982e-06,
  6.021116179797982e-06,
  0.02513774491992636],
 'bin_index': None,
 'summarizer': {'type': 'UnivariateContinuous',
  'bin_mode': 'Quantile',
  'aggregation': 'Density',
  'metric': 'PSI',
  'num_bins': 5,
  'bin_weights': None,
  'provided_edges': None},
 'status': 'Warning',
 'created_at': None}
print(len(response))
30

Get Assay Results

Retrieve the results for an assay.

  • Parameters
    • assay_id - (REQUIRED integer): Numerical id for the assay.
    • start - (OPTIONAL string): DateTime for when the baseline starts.
    • end - (OPTIONAL string): DateTime for when the baseline ends.
    • limit - (OPTIONAL integer): Maximum number of results to return.
    • pipeline_id - (OPTIONAL integer): Numerical id of the pipeline the assay is in.
  • Returns
    • Assay Baseline
# Get Assay Results

# Retrieve the token 
headers = wl.auth.auth_header()

api_request = f"{APIURL}/v1/api/assays/get_results"

data = {
    'assay_id': example_assay_id,
    'pipeline_id': exampleAssayPipelineId
}

response = requests.post(api_request, json=data, headers=headers, verify=True).json()
response

10 - Wallaroo MLOps API Essentials Guide: Connections Management

How to use the Wallaroo API for Connections Management

Wallaroo Data Connections are provided to establish connections to external data stores for requesting or submitting information. They provide a source of truth for data source connection information to enable repeatability and access control within your ecosystem. The actual implementation of data connections are managed through other means, such as Wallaroo Pipeline Orchestrators and Tasks, where the libraries and other tools used for the data connection can be stored.

Create Connection

Lists the enablement features for the Wallaroo instance.

  • REQUEST PATH
    • POST /v1/api/connections/create
  • PARAMETERS
    • name (String Required): The name of the connection.
    • type (String Required): The user defined type of connection.
    • details (String Required): User defined configuration details for the data connection. These can be {'username':'dataperson', 'password':'datapassword', 'port': 3339}, or {'token':'abcde123==', 'host':'example.com', 'port:1234'}, or other user defined combinations.
  • RETURNS
    • id (String): The id of the connection id in UUID format.
# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/create"

# input connection
data = {
    'name': bigquery_connection_input_name,
    'type' : bigquery_connection_input_type,
    'details': bigquery_connection_input_argument
}

response=requests.post(url, headers=headers, json=data).json()
display(response)

{'id': '02e587e9-e564-4af4-b3c4-0800994febfd'}

Add Connection to Workspace

Adds a connection to a Wallaroo workspace, accessible by workspace members.

  • REQUEST PATH
    • POST /v1/api/connections/create
  • PARAMETERS
    • workspace_id (String Required): The name of the connection.
    • connection_id (String Required): The UUID connection ID
  • RETURNS
    • id (String): The id of the workspace’s connection id in UUID format. Note that the workspace’s connection id is separate from the Wallaroo connection id.
# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/add_to_workspace"

data = {
    'workspace_id': workspace_id,
    'connection_id': connection_input_id
}

response=requests.post(url, headers=headers, json=data)
display(response.json())

{'id': '0c470074-d864-4467-bcd4-f6e8de7794e1'}

List Connections in Workspace

Lists the connections set to the specified workspace.

Adds a connection to a Wallaroo workspace, accessible by workspace members.

  • REQUEST PATH
    • POST /v1/api/connections/create
  • PARAMETERS
    • workspace_id (Integer Required): The numerical id of the workspace.
  • RETURNS
    • List[workspaces]: List of the connections in the workspace.
      • name (String): The id of the workspace’s connection id in UUID format. Note that the connection’s workspace connection id is separate from the Wallaroo connection id.
      • type (String): The user defined type of connection.
      • details (dict): The user defined details of the connection.
      • created_at (String): Timestamp of the connection creation date.
      • workspace_names (List[String]): List of workspaces the connection is attached to.
# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/list"

# output connection
data = {
    'workspace_id': workspace_id
}

response=requests.post(url, headers=headers, json=data).json()
display(response)

{'connections': [{'id': '2274dd22-2a18-454a-8cde-383cc013928b',
   'name': 'bigqueryhouseapiinput',
   'type': 'BIGQUERY',
   'details': {'type': 'service_account',
    'table': 'housepricesaga_inputs',
    'dataset': 'release_testing_2023_2',
   'created_at': '2023-05-16T16:43:05.170914Z',
   'workspace_names': ['bigqueryapiworkspace']}
    ]
}

Get Connection by Name

Retrieves the connection by the unique connection name.

  • REQUEST PATH
    • POST /v1/api/connections/get
  • PARAMETERS
    • name (String Required): The name of the connection.
  • RETURNS
    • name (String): The id of the workspace’s connection id in UUID format. Note that the workspace’s connection id is separate from the Wallaroo connection id.
    • type (String): The user defined type of connection.
    • details (dict): The user defined details of the connection.
    • created_at (String): Timestamp of the connection creation date.
    • workspace_names (List[String]): List of workspaces the connection is attached to.
# get the connection input details

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/get"

data = {
    'name': bigquery_connection_input_name
}

display(requests.post(url, headers=headers, json=data).json())

{'id': '2274dd22-2a18-454a-8cde-383cc013928b',
   'name': 'bigqueryhouseapiinput',
   'type': 'BIGQUERY',
   'details': {'type': 'service_account',
    'table': 'housepricesaga_inputs',
    'dataset': 'release_testing_2023_2',
   'created_at': '2023-05-16T16:43:05.170914Z',
   'workspace_names': ['bigqueryapiworkspace']}

Get Connection by Id

Retrieves the connection by the unique connection id.

  • REQUEST PATH
    • POST /v1/api/connections/get_by_id
  • PARAMETERS
    • id (String Required): The id of the connection.
  • RETURNS
    • name (String): The id of the workspace’s connection id in UUID format. Note that the workspace’s connection id is separate from the Wallaroo connection id.
    • type (String): The user defined type of connection.
    • details (dict): The user defined details of the connection.
    • created_at (String): Timestamp of the connection creation date.
    • workspace_names (List[String]): List of workspaces the connection is attached to.
# get the connection input details

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/get_by_id"

data = {
    'id': bigquery_connection_input_name
}

display(requests.post(url, headers=headers, json=data).json())

{'id': '2274dd22-2a18-454a-8cde-383cc013928b',
   'name': 'bigqueryhouseapiinput',
   'type': 'BIGQUERY',
   'details': {'type': 'service_account',
    'table': 'housepricesaga_inputs',
    'dataset': 'release_testing_2023_2',
   'created_at': '2023-05-16T16:43:05.170914Z',
   'workspace_names': ['bigqueryapiworkspace']}

Remove Connection from Workspace

Removes the specified connection from the workspace.

  • REQUEST PATH
    • POST /v1/api/connections/remove_from_workspace
  • PARAMETERS
    • workspace_id (Integer Required): The numerical id of the workspace.
    • connection_id (String Required): The connection id.
  • RETURNS
    • HTTP 204 response.
# get the connection input details

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/remove_from_workspace"

data = {
    'workspace_id': workspace_id,
    'connection_id': connection_input_id
}

display(requests.post(url, headers=headers, json=data))

<Response [204]>

Delete Connection

Deletes the connection from the Wallaroo instance. The connection must be removed from all workspaces before it can be deleted.

  • REQUEST PATH
    • POST /v1/api/connections/delete
  • PARAMETERS
    • name (String Required): The connection name.
  • RETURNS
    • HTTP 204 response.
# get the connection input details

# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/connections/delete"

data = {
    'name': bigquery_connection_input_name
}

display(requests.post(url, headers=headers, json=data))

<Response [204]>

11 - Wallaroo MLOps API Essentials Guide: ML Workload Orchestration Management

How to use the Wallaroo API for ML Workload Orchestration Management

Wallaroo provides ML Workload Orchestrations and Tasks to automate processes in a Wallaroo instance. For example:

  • Deploy a pipeline, retrieve data through a data connector, submit the data for inferences, undeploy the pipeline
  • Replace a model with a new version
  • Retrieve shadow deployed inference results and submit them to a database

Orchestration Flow

ML Workload Orchestration flow works within 3 tiers:

TierDescription
ML Workload OrchestrationUser created custom instructions that provide automated processes that follow the same steps every time without error. Orchestrations contain the instructions to be performed, uploaded as a .ZIP file with the instructions, requirements, and artifacts.
TaskInstructions on when to run an Orchestration as a scheduled Task. Tasks can be Run Once, where is creates a single Task Run, or Run Scheduled, where a Task Run is created on a regular schedule based on the Kubernetes cronjob specifications. If a Task is Run Scheduled, it will create a new Task Run every time the schedule parameters are met until the Task is killed.
Task RunThe execution of an task. These validate business operations are successful identify any unsuccessful task runs. If the Task is Run Once, then only one Task Run is generated. If the Task is a Run Scheduled task, then a new Task Run will be created each time the schedule parameters are met, with each Task Run having its own results and logs.
Wallaroo Components

One example may be of making donuts.

  • The ML Workload Orchestration is the recipe.
  • The Task is the order to make the donuts. It might be Run Once, so only one set of donuts are made, or Run Scheduled, so donuts are made every 2nd Sunday at 6 AM. If Run Scheduled, the donuts are made every time the schedule hits until the order is cancelled (aka killed).
  • The Task Run are the donuts with their own receipt of creation (logs, etc).

Orchestration Requirements

Orchestrations are uploaded to the Wallaroo instance as a ZIP file with the following requirements:

ParameterTypeDescription
User Code(Required) Python script as .py filesIf main.py exists, then that will be used as the task entrypoint. Otherwise, the first main.py found in any subdirectory will be used as the entrypoint. If no main.py is found, the orchestration will not be accepted.
Python Library Requirements(Optional) requirements.txt file in the requirements file format.A standard Python requirements.txt for any dependencies to be provided in the task environment. The Wallaroo SDK will already be present and should not be included in the requirements.txt. Multiple requirements.txt files are not allowed.
Other artifacts Other artifacts such as files, data, or code to support the orchestration.

Zip Instructions

In a terminal with the zip command, assemble artifacts as above and then create the archive. The zip command is included by default with the Wallaroo JupyterHub service.

zip commands take the following format, with {zipfilename}.zip as the zip file to save the artifacts to, and each file thereafter as the files to add to the archive.

zip {zipfilename}.zip file1, file2, file3....

For example, the following command will add the files main.py and requirements.txt into the file hello.zip.

$ zip hello.zip main.py requirements.txt 
  adding: main.py (deflated 47%)
  adding: requirements.txt (deflated 52%)

Example requirements.txt file

dbt-bigquery==1.4.3
dbt-core==1.4.5
dbt-extractor==0.4.1
dbt-postgres==1.4.5
google-api-core==2.8.2
google-auth==2.11.0
google-auth-oauthlib==0.4.6
google-cloud-bigquery==3.3.2
google-cloud-bigquery-storage==2.15.0
google-cloud-core==2.3.2
google-cloud-storage==2.5.0
google-crc32c==1.5.0
google-pasta==0.2.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.4

Orchestration Recommendations

The following recommendations will make using Wallaroo orchestrations.

  • The version of Python used should match the same version as in the Wallaroo JupyterHub service.
  • The same version of the Wallaroo SDK should match the server. For a 2023.2.1 Wallaroo instance, use the Wallaroo SDK version 2023.2.1.
  • Specify the version of pip dependencies.
  • The wallaroo.Client constructor auth_type argument is ignored. Using wallaroo.Client() is sufficient.
  • The following methods will assist with orchestrations:
    • wallaroo.in_task() : Returns True if the code is running within an orchestration task.
    • wallaroo.task_args(): Returns a Dict of invocation-specific arguments passed to the run_ calls.
  • Orchestrations will be run in the same way as running within the Wallaroo JupyterHub service, from the version of Python libraries (unless specifically overridden by the requirements.txt setting, which is not recommended), and running in the virtualized directory /home/jovyan/.

Orchestration Code Samples

The following demonstres using the wallaroo.in_task() and wallaroo.task_args() methods within an Orchestration. This sample code uses wallaroo.in_task() to verify whether or not the script is running as a Wallaroo Task. If true, it will gather the wallaroo.task_args() and use them to set the workspace and pipeline. If False, then it sets the pipeline and workspace manually.

# get the arguments
wl = wallaroo.Client()

# if true, get the arguments passed to the task
if wl.in_task():
  arguments = wl.task_args()
  
  # arguments is a key/value pair, set the workspace and pipeline name
  workspace_name = arguments['workspace_name']
  pipeline_name = arguments['pipeline_name']
  
# False:  We're not in a Task, so set the pipeline manually
else:
  workspace_name="bigqueryworkspace"
  pipeline_name="bigquerypipeline"

ML Workload Orchestration Methods

Upload Orchestration

Uploads an orchestration to the Wallaroo workspace.

  • REQUEST PATH
    • POST multipart/form-data /v1/api/orchestration/upload
  • PARAMETERS
    • file: The file data as Content-Type application/octet-stream.
    • metadata: The metadata including the workspace_id as Content-Type application/json.
  • RETURNS
    • id (String): The id of the orchestration in UUID format.
# retrieve the authorization token
headers = wl.auth.auth_header()

url = f"{APIURL}/v1/api/orchestration/upload"

fp = open("./api_inference_orchestration.zip", "rb")

metadata = {
    "workspace_id": workspace_id
}

response = requests.post(
    url,
    headers=headers,
    files=[("file", ("api_inference_orchestration.zip", fp, "application/octet-stream")), 
         ("metadata", ("metadata", '{"workspace_id": 5}', "application/json"))],
).json()

display(response)
orchestration_id = response['id']

{'id': 'fd823818-91cf-4d78-9ec2-f74faa1a05f3'}

List Orchestrations by Workspace

Uploads an orchestration to the Wallaroo workspace.

  • REQUEST PATH
    • POST /v1/api/orchestration/list
  • PARAMETERS
    • workspace_id (Integer Required): The numerical id of the workspace.
  • RETURNS
    • List[orchestrations]: A list of the orchestrations in the workspace.
      • id (String): The The id of the orchestration in UUID format.
      • sha: The sha hash of the orchestration.
      • name (String): The name of the orchestration.
      • file_name (String): The name of the file uploaded for the orchestration.
      • task_id (String): The task id managing unpacking and installing the orchestration.
      • owner_id (String): The Keycloak ID of the user that created the orchestration.
      • created_at (String): The timestamp of when the orchestration was created.
      • updated_at (String): The timestamp of when the orchestration was updated.

Tasks

Tasks are the implementation of an orchestration. Think of the orchestration as the instructions to follow, and the Task is the unit actually doing it.

Tasks are created at the workspace level.

Create Tasks

Tasks are created from an orchestration through the following methods.

Task TypeDescription
/v1/api/task/run_onceRun the task once based.
/v1/api/task/run_scheduledRun on a schedule, repeat every time the schedule fits the task until it is killed.

Run Task Once

Run Once aka Temporary Run tasks are created from an Orchestration with the request:

  • REQUEST PATH
    • POST /v1/api/task/run_once.
  • PARAMETERS
    • name (String Required): The name of the task to create.
    • orch_id (String Required): The id of the orchestration to create the task from.
    • timeout (Integer Optional): The timeout period to run the task before cancelling it in seconds.
    • workspace_id (Integer Required): The numerical id of the workspace to create the task within.
  • RETURNS
    • List[tasks]: A list of the tasks in the workspace.
      • id (String): The The id of the orchestration in UUID format.
      • sha: The sha hash of the orchestration.
      • name (String): The name of the orchestration.
      • file_name (String): The name of the file uploaded for the orchestration.
      • task_id (String): The task id managing unpacking and installing the orchestration.
      • owner_id (String): The Keycloak ID of the user that created the orchestration.
      • created_at (String): The timestamp of when the orchestration was created.
      • updated_at (String): The timestamp of when the orchestration was updated.
# retrieve the authorization token
headers = wl.auth.auth_header()

data = {
    "workspace_id": workspace_id,
    "orch_id": orchestration_id,
    "json": {}
}

url=f"{APIURL}/v1/api/task/run_once"

response=requests.post(url, headers=headers, json=data).json()
display(response)

{'id': '1cb1b550-384f-4476-8619-03e0aca71409'}

Run Task Scheduled

Run Once aka Temporary Run tasks are created from an Orchestration with the request:

  • REQUEST PATH
    • POST /v1/api/task/run_once.
  • PARAMETERS
    • name (String Optional): The name of the task to create.
    • orch_id (String Required): The id of the orchestration to create the task from.
    • schedule (String Required): Schedule in the cron format of: hour, minute, day_of_week, day_of_month, month.
    • timeout (Integer Optional): The timeout period to run the task before cancelling it in seconds.
    • workspace_id (Integer Required): The numerical id of the workspace to create the task within.
  • RETURNS
    • id: The task id in UUID format.
# retrieve the authorization token
headers = wl.auth.auth_header()

data = {
    "workspace_id": workspace_id,
    "orch_id": orchestration_id,
    "schedule": "*/1 * * * *",
    "json": {}
}

url=f"{APIURL}/v1/api/task/run_scheduled"

response=requests.post(url, headers=headers, json=data).json()
display(response)

{'id': 'd81c6e65-3b1f-42d7-8d2f-3cfc0eb51599'}

Get Task by ID

Retrieve the task by its id.

  • REQUEST PATH
    • POST /v1/api/task/get_by_id.
  • PARAMETERS
    • id (String Required): The id of the task to retrieve.
  • RETURNS (partial list)
    • name (String|None): The name of the task.
    • id (String): The The id of the task in UUID format.
    • image (String): The Docker image used to run the task.
    • image_tag (String): The Docker tag for the image used to run the task.
    • bind_secrets (List[String]): The service secrets used to run the task.
    • extra_env_vars (Dict): The additional variables used to run the task.
    • auth_init (Bool Default: True): Whether the authorization to run this task is automatically enabled. This allows the task to use Wallaroo resources.
    • status (String): The status of the task. Status are: pending, started, and failed.
    • workspace_id: The workspace the task is connected to.
    • killed: Whether the task has been issued the kill request.
    • created_at (String): The timestamp of when the orchestration was created.
    • updated_at (String): The timestamp of when the orchestration was updated.
    • last_runs (List[runs]): List of previous runs that display the run_id, status, and created_at.
# retrieve the authorization token
headers = wl.auth.auth_header()

url=f"{APIURL}/v1/api/task/get_by_id"

data = {
    "id": task_id
}

response=requests.post(url, headers=headers, json=data).json()
display(response)

{'name': None,
 'id': '1cb1b550-384f-4476-8619-03e0aca71409',
 'image': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/arbex-orch-deploy',
 'image_tag': 'v2023.2.0-main-3228',
 'bind_secrets': ['minio'],
 'extra_env_vars': {'MINIO_URL': 'http://minio.wallaroo.svc.cluster.local:9000',
  'ORCH_OWNER_ID': '3aac9f67-3050-4502-915f-fc2f871ee350',
  'ORCH_SHA': 'd3b93c9f280734106376e684792aa8b4285d527092fe87d89c74ec804f8e169e',
  'TASK_ID': '1cb1b550-384f-4476-8619-03e0aca71409'},
 'auth_init': True,
 'workspace_id': 5,
 'flavor': 'exec_orch_oneshot',
 'reap_threshold_secs': 900,
 'exec_type': 'job',
 'status': 'pending',
 'input_data': {},
 'killed': False,
 'created_at': '2023-05-16T15:59:57.496421+00:00',
 'updated_at': '2023-05-16T15:59:57.496421+00:00',
 'last_runs': []}

Get Tasks by Orchestration SHA

Tasks tied to the same orchestration are retrieved through the following request.

  • REQUEST
    • POST /v1/api/task/get_tasks_by_orch_sha
  • PARAMETERS
    • sha: The orchestrations SHA hash.
  • RETURNS
    • ids: List[string] List of tasks by UUID.
# retrieve the authorization token
headers = wl.auth.auth_header()

url=f"{APIURL}/v1/api/task/get_tasks_by_orch_sha"

data = {
    "sha": orchestration_sha
}

response=requests.post(url, headers=headers, json=data).json()
display(response)
{'ids': ['2424a9a7-2331-42f6-bd84-90643386b130',
  'c868aa44-f7fe-4e3d-b11d-e1e6af3ec150',
  'a41fe4ae-b8a4-4e1f-a45a-114df64ae2bc']}

List Tasks in Workspace

Retrieve all tasks in a workspace.

  • REQUEST PATH
    • POST /v1/api/task/list.
  • PARAMETERS
    • workspace_id (String Required): The numerical id of the workspace.
  • RETURNS (partial list)
    • List[tasks]: A list of the tasks with the following attributes.
      • name (String|None): The name of the task.
      • id (String): The The id of the task in UUID format.
      • image (String): The Docker image used to run the task.
      • image_tag (String): The Docker tag for the image used to run the task.
      • bind_secrets (List[String]): The service secrets used to run the task.
      • extra_env_vars (Dict): The additional variables used to run the task.
      • auth_init (Bool Default: True): Whether the authorization to run this task is automatically enabled. This allows the task to use Wallaroo resources.
      • status (String): The status of the task. Status are: pending, started, and failed.
      • workspace_id: The workspace the task is connected to.
      • killed: Whether the task has been issued the kill request.
      • created_at (String): The timestamp of when the orchestration was created.
      • updated_at (String): The timestamp of when the orchestration was updated.
      • last_runs (List[runs]): List of previous runs that display the run_id, status, and created_at.
# retrieve the authorization token
headers = wl.auth.auth_header()

url=f"{APIURL}/v1/api/task/list"

data = {
    "workspace_id": workspace_id
}

response=requests.post(url, headers=headers, json=data).json()
display(response)

[{'name': None,
  'id': '33a2b093-b34f-46f1-9c59-7c4d7c0f3188',
  'image': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/arbex-orch-deploy',
  'image_tag': 'v2023.2.0-main-3228',
  'bind_secrets': ['minio'],
  'extra_env_vars': {'MINIO_URL': 'http://minio.wallaroo.svc.cluster.local:9000',
   'ORCH_OWNER_ID': '3aac9f67-3050-4502-915f-fc2f871ee350',
   'ORCH_SHA': 'd3b93c9f280734106376e684792aa8b4285d527092fe87d89c74ec804f8e169e',
   'TASK_ID': '33a2b093-b34f-46f1-9c59-7c4d7c0f3188'},
  'auth_init': True,
  'workspace_id': 5,
  'flavor': 'exec_orch_oneshot',
  'reap_threshold_secs': 900,
  'exec_type': 'job',
  'status': 'started',
  'input_data': {},
  'killed': False,
  'created_at': '2023-05-16T14:23:56.062627+00:00',
  'updated_at': '2023-05-16T14:24:06.556855+00:00',
  'last_runs': [{'run_id': 'da1ffa75-e3c4-40bc-9bac-001f430c61d2',
    'status': 'success',
    'created_at': '2023-05-16T14:24:02.364998+00:00'}]
    }
]

Kill Task

Kills the task so it will not generate a new Task Run. Note that a Task set to Run Scheduled will generate a new Task Run each time the schedule parameters are met until the Task is killed. A Task set to Run Once will generate only one Task Run, so does not need to be killed.

  • REQUEST PATH
    • POST /v1/api/task/kill.
  • PARAMETERS
    • id (String Required): The id of th task.
  • RETURNS (partial list)
    • name (String|None): The name of the task.
    • id (String): The The id of the task in UUID format.
    • image (String): The Docker image used to run the task.
    • image_tag (String): The Docker tag for the image used to run the task.
    • bind_secrets (List[String]): The service secrets used to run the task.
    • extra_env_vars (Dict): The additional variables used to run the task.
    • auth_init (Bool Default: True): Whether the authorization to run this task is automatically enabled. This allows the task to use Wallaroo resources.
    • status (String): The status of the task. Status are: pending, started, and failed.
    • workspace_id: The workspace the task is connected to.
    • killed: Whether the task has been issued the kill request.
    • created_at (String): The timestamp of when the orchestration was created.
    • updated_at (String): The timestamp of when the orchestration was updated.
    • last_runs (List[runs]): List of previous runs that display the run_id, status, and created_at.
# retrieve the authorization token
headers = wl.auth.auth_header()

url=f"{APIURL}/v1/api/task/kill"

data = {
    "id": scheduled_task_id
}

response=requests.post(url, headers=headers, json=data).json()
display(response)

{'name': None,
 'id': 'd81c6e65-3b1f-42d7-8d2f-3cfc0eb51599',
 'image': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/arbex-orch-deploy',
 'image_tag': 'v2023.2.0-main-3228',
 'bind_secrets': ['minio'],
 'extra_env_vars': {'MINIO_URL': 'http://minio.wallaroo.svc.cluster.local:9000',
  'ORCH_OWNER_ID': '3aac9f67-3050-4502-915f-fc2f871ee350',
  'ORCH_SHA': 'd3b93c9f280734106376e684792aa8b4285d527092fe87d89c74ec804f8e169e',
  'TASK_ID': 'd81c6e65-3b1f-42d7-8d2f-3cfc0eb51599'},
 'auth_init': True,
 'workspace_id': 5,
 'schedule': '*/1 * * * *',
 'reap_threshold_secs': 900,
 'status': 'pending_kill',
 'input_data': {},
 'killed': False,
 'created_at': '2023-05-16T16:01:15.947982+00:00',
 'updated_at': '2023-05-16T16:01:16.456502+00:00',
 'last_runs': [{'run_id': '3ff90356-b7b3-4909-b5c0-963acd19f3f5',
   'status': 'success',
   'created_at': '2023-05-16T16:02:02.159514+00:00'}]}

Task Runs

Task Runs are generated from a Task. If the Task is Run Once, then only one Task Run is generated. If the Task is a Run Scheduled task, then a new Task Run will be created each time the schedule parameters are met, with each Task Run having its own results and logs.

Task Last Runs History

The history of a task, which each deployment of the task is known as a task run is retrieved with the Task last_runs method that takes the following arguments. It returns the reverse chronological order of tasks runs listed by updated_at.

  • REQUEST
    • POST /v1/api/task/list_task_runs
  • PARAMETERS
    • task_id: The numerical identifier of the task.
    • status: Filters the task history by the status. If all, returns all statuses. Status values are:
      • running: The task has started.
      • failure: The task failed.
      • success: The task completed.
    • limit: The number of tasks runs to display.
  • RETURNS
    • ids: List of task runs ids in UUID.
# retrieve the authorization token
headers = wl.auth.auth_header()

url=f"{APIURL}/v1/api/task/list_task_runs"

data = {
    "task_id": task_id
}

response=requests.post(url, headers=headers, json=data).json()
task_run_id = response[0]['run_id']
display(response)
[{'task': 'c868aa44-f7fe-4e3d-b11d-e1e6af3ec150',
  'run_id': '96a7f85f-e30c-40b5-9185-0dee5bd1a15e',
  'status': 'success',
  'created_at': '2023-05-22T21:08:33.112805+00:00',
  'updated_at': '2023-05-22T21:08:33.112805+00:00'}]

Get Task Run Logs

Logs for a task run are retrieved through the following process.

  • REQUEST
    • POST /v1/api/task/get_logs_for_run
  • PARAMETERS
    • id: The numerical identifier of the task run associated with the orchestration.
    • lines: The number of log lines to retrieve starting from the end of the log.
  • RETURNS
    • logs: Array of log entries.
# retrieve the authorization token
headers = wl.auth.auth_header()

url=f"{APIURL}/v1/api/task/get_logs_for_run"

data = {
    "id": task_run_id
}

response=requests.post(url, headers=headers, json=data).json()
display(response)
    {'logs': ["2023-05-22T21:09:17.683428502Z stdout F {'pipeline_name': 'apipipelinegsze', 'workspace_name': 'apiorchestrationworkspacegsze'}",
      '2023-05-22T21:09:17.683489102Z stdout F Getting the workspace apiorchestrationworkspacegsze',
      '2023-05-22T21:09:17.683497403Z stdout F Getting the pipeline apipipelinegsze',
      '2023-05-22T21:09:17.683504003Z stdout F Deploying the pipeline.',
      '2023-05-22T21:09:17.683510203Z stdout F Performing sample inference.',
      '2023-05-22T21:09:17.683516203Z stdout F                      time  ... check_failures',
      '2023-05-22T21:09:17.683521903Z stdout F 0 2023-05-22 21:08:37.779  ...              0',
      '2023-05-22T21:09:17.683527803Z stdout F ',
      '2023-05-22T21:09:17.683533603Z stdout F [1 rows x 4 columns]',
      '2023-05-22T21:09:17.683540103Z stdout F Undeploying the pipeline']}

12 - Wallaroo MLOps API Essentials Guide: Inference Management

How to use Wallaroo MLOps Api for inferencing

Deployed pipelines have their own Inference URL that accepts HTTP POST submissions.

For connections that are external to the Kubernetes cluster hosting the Wallaroo instance, model endpoints must be enabled.

HTTP Headers

The following headers are required for connecting the the Pipeline Deployment URL:

  • Authorization: This requires the JWT token in the format 'Bearer ' + token. For example:

    Authorization: Bearer abcdefg==
    
  • Content-Type:

  • For DataFrame formatted JSON:

    Content-Type:application/json; format=pandas-records
    
  • For Arrow binary files, the Content-Type is application/vnd.apache.arrow.file.

    Content-Type:application/vnd.apache.arrow.file
    
  • IMPORTANT NOTE: Verify that the pipeline deployed has status Running before attempting an inference.

# Retrieve the token
headers = wl.auth.auth_header()

# set Content-Type type
headers['Content-Type']='application/json; format=pandas-records'

## Inference through external URL using dataframe

# retrieve the json data to submit
data = [
    {
        "tensor":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
]

# submit the request via POST, import as pandas DataFrame
response = pd.DataFrame.from_records(
    requests.post(
        deployurl, 
        json=data, 
        headers=headers)
        .json()
    )

display(response)
timeinoutcheck_failuresmetadata
01684356836285{'tensor': [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756]}{'dense_1': [0.0014974177]}[]{'last_model': '{"model_name":"apimodel","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"}', 'pipeline_version': '', 'elapsed': [163502, 309804]}