During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If the conversion is unsuccessful, it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.
Hugging Face Schemas
Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following frameworks:
Framework.HUGGING_FACE_IMAGE_TO_TEXT
Framework.HUGGING_FACE_TEXT_CLASSIFICATION
Framework.HUGGING_FACE_SUMMARIZATION
Framework.HUGGING_FACE_TRANSLATION
Additional inputs added to these Hugging Face pipelines will be passed as key/value pair arguments to the model’s generate method. If an argument is not provided, the model will default to the values coded in the original Hugging Face model’s source code.
Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.
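This forwarding behavior can be illustrated with a toy sketch (this is not Wallaroo or Hugging Face code; generate and its parameters here are hypothetical stand-ins for the underlying model's generate method):

```python
# Hypothetical stand-in for a Hugging Face model's generate method
def generate(inputs, num_beams=1, max_new_tokens=20):
    return f"generated from {inputs!r} with num_beams={num_beams}"

def run_pipeline(record):
    # Everything beyond the required 'inputs' field is forwarded
    # as key/value pair keyword arguments to generate
    required = {"inputs"}
    extra = {k: v for k, v in record.items() if k not in required}
    return generate(record["inputs"], **extra)

# Extra fields become keyword arguments; omitted ones fall back to defaults
print(run_pipeline({"inputs": "hello", "num_beams": 4}))
# -> generated from 'hello' with num_beams=4
```

Passing a parameter the model's generate method does not support raises a TypeError here, mirroring the error the model returns for unsupported inputs.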
Schemas:
input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
])
Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.
Schemas:
input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])
input_schema = pa.schema([
    pa.field('images',
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
            list_size=480
        )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly this number of predictions, so we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box',
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image; each sublist contains the 4 box coordinates
            pa.list_(
                pa.int64(),
                list_size=4
            ),
        ),
    ),
])
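The nesting order above means each image is a 480 x 640 x 3 (height, width, channels) structure of int64 pixel values. A minimal stdlib-only sketch of a conforming input (with numpy you would typically produce this via image.astype('int64').tolist()):

```python
height, width, channels = 480, 640, 3

# Build one all-black image matching the schema's nesting:
# outer list_size=480 (rows), then 640 (columns), then 3 (RGB channels)
image = [[[0] * channels for _ in range(width)] for _ in range(height)]

assert len(image) == 480        # rows
assert len(image[0]) == 640     # columns per row
assert len(image[0][0]) == 3    # channel values per pixel
```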
Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.
input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32())), # required: the audio stored in numpy arrays of shape (num_samples,) and data type `float32`
    pa.field('return_timestamps', pa.string()) # optional: return start & end times for each predicted chunk
])

output_schema = pa.schema([
    pa.field('text', pa.string()), # required: the output text corresponding to the audio input
    pa.field('chunks', pa.list_(pa.struct([('text', pa.string()), ('timestamp', pa.list_(pa.float32()))]))), # required (if `return_timestamps` is set): start & end times for each predicted chunk
])
1 - Wallaroo API Upload Tutorial: Hugging Face Zero Shot Classification
How to upload a Hugging Face Zero Shot Classification model to Wallaroo via the MLOps API.
To perform the various Wallaroo MLOps API requests, we will use the Wallaroo SDK to generate the necessary tokens. For details on other methods of requesting and using authentication tokens with the Wallaroo MLOps API, see the Wallaroo API Connection Guide.
This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.
If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.
wl=wallaroo.Client()
Variables
The following variables are set for use throughout the rest of the tutorial:
Wallaroo Workspace
Wallaroo Pipeline
Wallaroo Model name and path
Wallaroo Model Framework
The DNS prefix and suffix for the Wallaroo instance.
To allow this tutorial to be run multiple times or by multiple users in the same Wallaroo instance, a random 4 character suffix will be appended to the workspace name.
Verify that the DNS prefix and suffix match the Wallaroo instance used for this tutorial. See the DNS Integration Guide for more details.
import string
import random

# make a random 4 character suffix to prevent overwriting other user's workspaces
suffix = ''.join(random.choice(string.ascii_lowercase) for i in range(4))

workspace_name = f'hugging-face-zero-shot-api{suffix}'
pipeline_name = 'hugging-face-zero-shot'
model_name = 'zero-shot-classification'
model_file_name = "./models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip"
framework = "hugging-face-zero-shot-classification"

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

APIURL = f"https://{wallarooPrefix}api.{wallarooSuffix}"
APIURL
'https://doc-test.api.wallarooexample.ai'
Create the Workspace
In a production environment, the Wallaroo workspace that contains the pipeline and models would be created and deployed. We will quickly recreate those steps using the MLOps API.
Workspaces are created through the MLOps API with the /v1/api/workspaces/create command. This requires the workspace name be provided, and that the workspace not already exist in the Wallaroo instance.
# Retrieve the token
headers = wl.auth.auth_header()

# set the Content-Type header
headers['Content-Type'] = 'application/json'

# Create workspace
apiRequest = f"{APIURL}/v1/api/workspaces/create"

data = {
    "workspace_name": workspace_name
}

response = requests.post(apiRequest, json=data, headers=headers, verify=True).json()
display(response)

# Stored for future examples
workspaceId = response['workspace_id']
{'workspace_id': 9}
Upload the Model
Endpoint:
/v1/api/models/upload_and_convert
Headers:
Content-Type: multipart/form-data
Parameters
name (String, Required): The model name.
visibility (String, Required): Either public or private.
workspace_id (Integer, Required): The numerical ID of the workspace to upload the model to.
conversion (String, Required): The conversion parameters, which include the following:
framework (String, Required): The framework of the model being uploaded. See the list of supported models for more details.
python_version (String, Required): The version of Python required for the model.
requirements (String, Required): Required libraries. Can be [] if the requirements are the default Wallaroo JupyterHub libraries.
input_schema (String, Optional): The input schema in the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
output_schema (String, Optional): The output schema in the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
Set the Schemas
The input and output schemas will be defined according to the Wallaroo Hugging Face schema requirements. The inputs are then base64 encoded for attachment in the API request.
input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])
We will now build the request to include the required data. We will be using the workspaceId returned when we created our workspace in a previous step, specifying the input and output schemas, and the framework.
# Get the model details

# Retrieve the token
headers = wl.auth.auth_header()

# set the Content-Type header
headers['Content-Type'] = 'application/json'

apiRequest = f"{APIURL}/v1/api/models/list_versions"

data = {
    "model_id": model_name,
    "models_pk_id": modelId
}

status = None

while status != 'ready':
    response = requests.post(apiRequest, json=data, headers=headers, verify=True).json()
    # verify we have the right version
    model = next(model for model in response if model["id"] == modelId)
    display(model)
    status = model['status']
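The loop above polls as fast as possible and never gives up. A slightly safer polling pattern, sketched here with a hypothetical fetch_status helper standing in for the list_versions request, adds a sleep between requests and a timeout:

```python
import time

def wait_until_ready(fetch_status, timeout=600, interval=5):
    """Poll fetch_status() until it returns 'ready' or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status == 'ready':
            return status
        time.sleep(interval)
    raise TimeoutError("model did not reach 'ready' before the timeout")

# Example with a canned sequence of statuses standing in for API responses:
statuses = iter(['pendingloading', 'attemptingloading', 'ready'])
print(wait_until_ready(lambda: next(statuses), interval=0))
# -> ready
```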
The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the Jupyter Hub interface provided with your Wallaroo environment.
This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.
If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). If logging in externally, update the wallarooPrefix and wallarooSuffix variables with the proper DNS information. For more information on Wallaroo DNS settings, see the Wallaroo DNS Integration Guide.
wl=wallaroo.Client()
Set Variables and Helper Functions
We’ll set the names of our workspace, pipeline, models, and files. Workspace names must be unique across the Wallaroo instance. For this, we’ll add a randomly generated 4 characters to the workspace name to prevent collisions with other users’ workspaces. If running this tutorial repeatedly, we recommend hard coding the workspace name so it uses the same workspace each time it’s run.
We’ll set up some helper functions that will either use existing workspaces and pipelines, or create them if they do not already exist.
def get_workspace(name):
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

import string
import random

# make a random 4 character suffix to prevent overwriting other user's workspaces
suffix = ''.join(random.choice(string.ascii_lowercase) for i in range(4))

# hard code an empty suffix to keep the same workspace between runs, as recommended above
suffix = ''

workspace_name = f'hf-zero-shot-classification{suffix}'
pipeline_name = 'hf-zero-shot-classification'
model_name = 'hf-zero-shot-classification'
model_file_name = './models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip'
Create Workspace and Pipeline
We will now create the Wallaroo workspace to store our model and set it as the current workspace. Future commands will default to this workspace for pipeline creation, model uploads, etc. We’ll create our Wallaroo pipeline to deploy our model.
The following parameters are required for Hugging Face models. Note that while some fields are considered optional for the upload_model method, they are required for proper upload of a Hugging Face model to Wallaroo.
name (string, Required): The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path (string, Required): The path to the model file being uploaded.
framework (string, Upload Method Optional, Hugging Face model Required): Set as the model framework, e.g. Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION.
input_schema (pyarrow.lib.Schema, Upload Method Optional, Hugging Face model Required): The input schema in Apache Arrow schema format.
output_schema (pyarrow.lib.Schema, Upload Method Optional, Hugging Face model Required): The output schema in Apache Arrow schema format.
convert_wait (bool, Upload Method Optional, Hugging Face model Optional, Default: True): True: Waits in the script for the model conversion to complete. False: Proceeds with the script without waiting for the model conversion process to complete.
The input and output schemas will be configured for the data inputs and outputs. More information on the available inputs is found in the official 🤗 Hugging Face source code.
input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])
Upload Model
The model will be uploaded with the framework set as Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION.
framework = Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION

model = wl.upload_model(model_name,
                        model_file_name,
                        framework=framework,
                        input_schema=input_schema,
                        output_schema=output_schema,
                        convert_wait=True)
model
Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime................................................successful
The model is uploaded and ready for use. We’ll add it as a step in our pipeline, then deploy the pipeline. For this example, we allocate 0.25 CPU and 4 Gi of RAM to the pipeline through the pipeline’s deployment configuration.
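The deployment_config referenced in the deploy call below is not defined elsewhere in this tutorial; a minimal sketch using the SDK's DeploymentConfigBuilder, assuming the 0.25 CPU / 4 Gi allocation described above, might be:

```python
from wallaroo.deployment_config import DeploymentConfigBuilder

# Sketch: allocate 0.25 CPU and 4 Gi of RAM to the pipeline,
# matching the allocation described above
deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25) \
    .memory('4Gi') \
    .build()
```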
pipeline=get_pipeline(pipeline_name)
# clear the pipeline if used previously
pipeline.undeploy()
pipeline.clear()
pipeline.add_model_step(model)
pipeline.deploy(deployment_config=deployment_config)
pipeline.status()
Run Inference
A sample inference will be run. First the pandas DataFrame used for the inference is created, then the inference is run through the pipeline’s infer method.
input_data = {
    "inputs": ["this is a test", "this is another test"], # required
    "candidate_labels": [["english", "german"], ["english", "german"]], # required
    "hypothesis_template": ["This example is {}.", "This example is {}."], # optional: using the defaults, similar to not passing this parameter
    "multi_label": [False, False], # optional: using the defaults, similar to not passing this parameter
}
dataframe=pd.DataFrame(input_data)
dataframe
   inputs                candidate_labels   hypothesis_template  multi_label
0  this is a test        [english, german]  This example is {}.  False
1  this is another test  [english, german]  This example is {}.  False
%time pipeline.infer(dataframe)
CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 5.48 µs
   time                     in.candidate_labels  in.hypothesis_template  in.inputs             in.multi_label  out.labels         out.scores                                 out.sequence          check_failures
0  2023-10-20 15:52:07.129  [english, german]    This example is {}.     this is a test        False           [english, german]  [0.504054605960846, 0.49594545364379883]   this is a test        0
1  2023-10-20 15:52:07.129  [english, german]    This example is {}.     this is another test  False           [english, german]  [0.5037839412689209, 0.4962160289287567]   this is another test  0
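As a small post-processing sketch (not part of the tutorial itself), the top label for each row can be read off the out.labels / out.scores pairs in the results above:

```python
# Values taken from the inference results table above
rows = [
    {"labels": ["english", "german"], "scores": [0.504054605960846, 0.49594545364379883]},
    {"labels": ["english", "german"], "scores": [0.5037839412689209, 0.4962160289287567]},
]

def top_label(row):
    # pair each label with its score and keep the label with the highest score
    return max(zip(row["labels"], row["scores"]), key=lambda pair: pair[1])[0]

print([top_label(r) for r in rows])
# -> ['english', 'english']
```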
Undeploy Pipelines
With the tutorial complete, the pipeline is undeployed to return the resources to the cluster.
pipeline.undeploy()
Waiting for undeployment - this will take up to 45s ........................................ ok