Wallaroo SDK Upload Tutorials: Hugging Face

How to upload different Hugging Face models to Wallaroo.

The following tutorials cover how to upload sample Hugging Face models. The complete list of supported Hugging Face pipelines is as follows.

Web Site: https://huggingface.co/models
Supported Libraries:
  • transformers==4.34.1
  • diffusers==0.14.0
  • accelerate==0.23.0
  • torchvision==0.14.1
  • torch==1.13.1
Frameworks: The following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
  • Framework.HUGGING_FACE_AUTOMATIC_SPEECH_RECOGNITION aka hugging-face-automatic-speech-recognition
Runtime: Containerized flight

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful, it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime, as sketched below.
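
As a quick illustration, the assigned runtime can be checked after upload, and containerized runtimes are typically given dedicated resources through the deployment configuration's sidekick settings. This is a minimal sketch: the model object and resource values here are illustrative assumptions, not required settings.

# check the runtime assigned during the upload process
model.config().runtime()  # 'flight' indicates a Containerized Runtime

# allocate resources to a containerized model's container (illustrative values)
deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .sidekick_cpus(model, 1) \
    .sidekick_memory(model, '2Gi') \
    .build()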

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING_FACE_IMAGE_TO_TEXT
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION
  • Framework.HUGGING_FACE_SUMMARIZATION
  • Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines are passed as key/value pair arguments to the model’s generate method, as sketched below. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.
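
For example, a summarization pipeline could accept an extra generate argument such as max_new_tokens simply by declaring it in the input schema. This is a sketch; the extra field is an illustrative assumption and is forwarded to generate unchanged.

import pyarrow as pa

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # documented pipeline input
    pa.field('max_new_tokens', pa.int64()), # forwarded to the model as generate(max_new_tokens=...)
])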

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo Framework: Framework.HUGGING_FACE_FEATURE_EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
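
As a usage sketch, an inference payload matching this schema is a pandas DataFrame with one row per text, assuming a deployed pipeline named pipeline:

import pandas as pd

# column names must match the input schema fields
input_data = pd.DataFrame({'inputs': ['This is a test sentence.']})
results = pipeline.infer(input_data)
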
Wallaroo Framework: Framework.HUGGING_FACE_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo Framework: Framework.HUGGING_FACE_IMAGE_SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo Framework: Framework.HUGGING_FACE_IMAGE_TO_TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo Framework: Framework.HUGGING_FACE_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo Framework: Framework.HUGGING_FACE_QUESTION_ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
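
As a usage sketch, a single question and context pair can be submitted as one DataFrame row; depending on how the schema is declared, the optional tuning fields may also need to be supplied. The pipeline object is an assumption.

import pandas as pd

input_data = pd.DataFrame({
    'question': ['What runtime does the model use?'],
    'context': ['The model was uploaded to Wallaroo and runs in a containerized runtime.'],
})
results = pipeline.infer(input_data)
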
Wallaroo Framework: Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.int64()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
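
The generated image is returned as a 128x128x3 nested list of integer pixel values. A sketch of rebuilding the image with numpy and PIL, assuming an inference result DataFrame named results:

import numpy as np
from PIL import Image

# Wallaroo inference results prefix output fields with 'out.'
pixels = np.array(results['out.images'][0], dtype=np.uint8)  # shape (128, 128, 3)
Image.fromarray(pixels).save('generated.png')
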
Wallaroo Framework: Framework.HUGGING_FACE_SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo Framework: Framework.HUGGING_FACE_TEXT_CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
])
Wallaroo Framework: Framework.HUGGING_FACE_TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo Framework: Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as the number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as the number of candidate labels; list_size can be skipped but may result in slightly worse performance
])
Wallaroo Framework: Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo Framework: Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # omitted so the model returns all detections rather than a fixed number
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo Framework: Framework.HUGGING_FACE_SENTIMENT_ANALYSIS
Reference: Hugging Face Sentiment Analysis
Wallaroo Framework: Framework.HUGGING_FACE_TEXT_GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
Wallaroo Framework: Framework.HUGGING_FACE_AUTOMATIC_SPEECH_RECOGNITION

Sample input and output schemas.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32())), # required: the audio stored in numpy arrays of shape (num_samples,) and data type `float32`
    pa.field('return_timestamps', pa.string()) # optional: return start & end times for each predicted chunk
]) 

output_schema = pa.schema([
    pa.field('text', pa.string()), # required: the output text corresponding to the audio input
    pa.field('chunks', pa.list_(pa.struct([('text', pa.string()), ('timestamp', pa.list_(pa.float32()))]))), # required (if `return_timestamps` is set), start & end times for each predicted chunk
])
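
A sketch of preparing the audio input, assuming the librosa library (not part of this tutorial) to load a waveform as float32 samples; the file name and sample rate are placeholders:

import librosa
import pandas as pd

# load the audio as a mono float32 waveform of shape (num_samples,)
audio, _ = librosa.load('sample.wav', sr=16000)

input_data = pd.DataFrame({
    'inputs': [audio.astype('float32').tolist()],
    'return_timestamps': ['word'], # optional
})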

1 - Wallaroo API Upload Tutorial: Hugging Face Zero Shot Classification

How to upload a Hugging Face Zero Shot Classification model to Wallaroo via the MLOps API.

This tutorial can be downloaded as part of the Wallaroo Tutorials repository.

Wallaroo Model Upload via MLOps API: Hugging Face Zero Shot Classification

The following tutorial demonstrates how to upload a Hugging Face Zero Shot model to a Wallaroo instance.

Tutorial Goals

Demonstrate the following:

  • Upload a Hugging Face Zero Shot Model to a Wallaroo instance.
  • Create a pipeline and add the model as a pipeline step.
  • Perform a sample inference.

Prerequisites

  • A Wallaroo version 2023.2.1 or above instance

Tutorial Steps

Import Libraries

import json
import os
import requests
import base64

import wallaroo
from wallaroo.pipeline   import Pipeline
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework

import pyarrow as pa
import numpy as np
import pandas as pd

Connect to Wallaroo

To perform the various Wallaroo MLOps API requests, we will use the Wallaroo SDK to generate the necessary tokens. For details on other methods of requesting and using authentication tokens with the Wallaroo MLOps API, see the Wallaroo API Connection Guide.

This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.

wl = wallaroo.Client()

Variables

The following variables will be set for the rest of the tutorial to set the following:

  • Wallaroo Workspace
  • Wallaroo Pipeline
  • Wallaroo Model name and path
  • Wallaroo Model Framework
  • The DNS prefix and suffix for the Wallaroo instance.

To allow this tutorial to be run multiple times or by multiple users in the same Wallaroo instance, a random 4 character suffix will be added to the workspace name.

Verify that the DNS prefix and suffix match the Wallaroo instance used for this tutorial. See the DNS Integration Guide for more details.

import string
import random

# make a random 4 character suffix to prevent overwriting other users' workspaces
suffix = ''.join(random.choice(string.ascii_lowercase) for i in range(4))
workspace_name = f'hugging-face-zero-shot-api{suffix}'
pipeline_name = 'hugging-face-zero-shot'
model_name = 'zero-shot-classification'
model_file_name = "./models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip"
framework = "hugging-face-zero-shot-classification"

wallarooPrefix = "YOUR PREFIX."
wallarooPrefix = "YOUR SUFFIX"

APIURL=f"https://{wallarooPrefix}api.{wallarooSuffix}"
APIURL
'https://doc-test.api.wallarooexample.ai'

Create the Workspace

In a production environment, the Wallaroo workspace containing the pipeline and models would already be created and deployed. We will quickly recreate those steps using the MLOps API.

Workspaces are created through the MLOps API with the /v1/api/workspaces/create command. This requires the workspace name be provided, and that the workspace not already exist in the Wallaroo instance.

Reference: MLOps API Create Workspace

# Retrieve the token
headers = wl.auth.auth_header()

# set the Content-Type header
headers['Content-Type']='application/json'

# Create workspace
apiRequest = f"{APIURL}/v1/api/workspaces/create"

data = {
  "workspace_name": workspace_name
}

response = requests.post(apiRequest, json=data, headers=headers, verify=True).json()
display(response)
# Stored for future examples
workspaceId = response['workspace_id']
{'workspace_id': 9}

Upload the Model

  • Endpoint:
    • /v1/api/models/upload_and_convert
  • Headers:
    • Content-Type: multipart/form-data
  • Parameters
    • name (String Required): The model name.
    • visibility (String Required): Either public or private.
    • workspace_id (String Required): The numerical ID of the workspace to upload the model to.
    • conversion (String Required): The conversion parameters that include the following:
      • framework (String Required): The framework of the model being uploaded. See the list of supported models for more details.
      • python_version (String Required): The version of Python required for model.
      • requirements (String Required): Required libraries. Can be [] if the requirements are default Wallaroo JupyterHub libraries.
      • input_schema (String Optional): The input schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
      • output_schema (String Optional): The output schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.

Set the Schemas

The input and output schemas will be defined according to the Wallaroo Hugging Face schema requirements. The schemas are then base64 encoded for attachment in the API request.

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as the number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as the number of candidate labels; list_size can be skipped but may result in slightly worse performance
])
encoded_input_schema = base64.b64encode(
                bytes(input_schema.serialize())
            ).decode("utf8")
encoded_output_schema = base64.b64encode(
                bytes(output_schema.serialize())
            ).decode("utf8")

Build the Request

We will now build the request to include the required data. We will be using the workspaceId returned when we created our workspace in a previous step, specifying the input and output schemas, and the framework.

metadata = {
    "name": model_name,
    "visibility": "private",
    "workspace_id": workspaceId,
    "conversion": {
        "framework": framework,
        "python_version": "3.8",
        "requirements": []
    },
    "input_schema": encoded_input_schema,
    "output_schema": encoded_output_schema,
}

Upload Model API Request

Now we will make our upload and convert request. The model ID returned in the response is stored for the next set of steps.

headers = wl.auth.auth_header()

files = {
    'metadata': (None, json.dumps(metadata), "application/json"),
    'file': (model_name, open(model_file_name,'rb'),'application/octet-stream')
}

response = requests.post(f'{APIURL}/v1/api/models/upload_and_convert', 
                         headers=headers, 
                         files=files)
print(response.json())
{'insert_models': {'returning': [{'models': [{'id': 9}]}]}}

# Store the model ID for the next step
modelId = response.json()['insert_models']['returning'][0]['models'][0]['id']
# Get the model details

# Retrieve the token
headers = wl.auth.auth_header()

# set the Content-Type header
headers['Content-Type']='application/json'

apiRequest = f"{APIURL}/v1/api/models/list_versions"

data = {
  "model_id": model_name,
  "models_pk_id" : modelId
}

status = None

while status != 'ready':
    response = requests.post(apiRequest, json=data, headers=headers, verify=True).json()
    # verify we have the right version
    model = next(model for model in response if model["id"] == modelId)
    display(model)
    status = model['status']
{'sha': '3dcc14dd925489d4f0a3960e90a7ab5917ab685ce955beca8924aa7bb9a69398',
 'models_pk_id': 7,
 'model_version': '284a2e7c-679a-40cc-9121-2747b81a0228',
 'owner_id': '""',
 'model_id': 'zero-shot-classification',
 'id': 7,
 'file_name': 'zero-shot-classification',
 'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3509',
 'status': 'ready'}

Model Upload Complete

With that, the model upload is complete and can be deployed into a Wallaroo pipeline.
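
As a bridge to the SDK tutorial that follows, deployment could proceed with a minimal sketch like the one below; the retrieval call and pipeline name are illustrative assumptions.

# retrieve the uploaded model through the SDK client `wl` from earlier
model = wl.list_models()[0].versions()[-1]  # retrieval method may vary by SDK version

pipeline = wl.build_pipeline('hf-zero-shot-api-deploy')  # illustrative name
pipeline.add_model_step(model)
pipeline.deploy(deployment_config=DeploymentConfigBuilder().cpus(0.25).memory('1Gi').build())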

2 - Wallaroo SDK Upload Tutorial: Hugging Face Zero Shot Classification

How to upload a Hugging Face Zero Shot Classification model to Wallaroo via the Wallaroo SDK.

This tutorial can be downloaded as part of the Wallaroo Tutorials repository.

Wallaroo Model Upload via the Wallaroo SDK: Hugging Face Zero Shot Classification

The following tutorial demonstrates how to upload a Hugging Face Zero Shot model to a Wallaroo instance.

Tutorial Goals

Demonstrate the following:

  • Upload a Hugging Face Zero Shot Model to a Wallaroo instance.
  • Create a pipeline and add the model as a pipeline step.
  • Perform a sample inference.

Prerequisites

  • A Wallaroo version 2023.2.1 or above instance.

Tutorial Steps

Import Libraries

The first step is to import the libraries we’ll be using. These are included by default in the Wallaroo instance’s JupyterHub service.

import json
import os

import wallaroo
from wallaroo.pipeline   import Pipeline
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework
from wallaroo.object import EntityNotFoundError

import os
os.environ["MODELS_ENABLED"] = "true"

import pyarrow as pa
import numpy as np
import pandas as pd

Open a Connection to Wallaroo

The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo installation and available through the JupyterHub interface provided with your Wallaroo environment.

This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). If logging in externally, update the wallarooPrefix and wallarooSuffix variables with the proper DNS information. For more information on Wallaroo DNS settings, see the Wallaroo DNS Integration Guide.

wl = wallaroo.Client()

Set Variables and Helper Functions

We’ll set the name of our workspace, pipeline, models and files. Workspace names must be unique across the Wallaroo instance. For this, we’ll append a randomly generated 4 character suffix to the workspace name to prevent collisions with other users’ workspaces. If running this tutorial repeatedly, we recommend hard coding the workspace name so it functions in the same workspace each time it’s run.

We’ll set up some helper functions that will either use existing workspaces and pipelines, or create them if they do not already exist.

def get_workspace(name):
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace= ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

import string
import random

# make a random 4 character suffix to prevent overwriting other users' workspaces
suffix = ''.join(random.choice(string.ascii_lowercase) for i in range(4))

# hard code the workspace name by clearing the suffix
suffix = ''

workspace_name = f'hf-zero-shot-classification{suffix}'
pipeline_name = 'hf-zero-shot-classification'

model_name = 'hf-zero-shot-classification'
model_file_name = './models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip'

Create Workspace and Pipeline

We will now create the Wallaroo workspace to store our model and set it as the current workspace. Future commands will default to this workspace for pipeline creation, model uploads, etc. We’ll create our Wallaroo pipeline to deploy our model.

workspace = get_workspace(workspace_name)
wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)

Configure Data Schemas

The following parameters are required for Hugging Face models. Note that while some fields are considered optional for the upload_model method, they are required for proper uploading of a Hugging Face model to Wallaroo.

  • name (string, Required): The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
  • path (string, Required): The path to the model file being uploaded.
  • framework (string, Upload Method Optional, Hugging Face model Required): Set as the framework.
  • input_schema (pyarrow.lib.Schema, Upload Method Optional, Hugging Face model Required): The input schema in Apache Arrow schema format.
  • output_schema (pyarrow.lib.Schema, Upload Method Optional, Hugging Face model Required): The output schema in Apache Arrow schema format.
  • convert_wait (bool, Upload Method Optional, Hugging Face model Optional, Default: True):
    • True: Waits in the script for the model conversion completion.
    • False: Proceeds with the script without waiting for the model conversion process to display complete.

The input and output schemas will be configured for the data inputs and outputs. More information on the available inputs can be found in the official 🤗 Hugging Face source code.

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as the number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as the number of candidate labels; list_size can be skipped but may result in slightly worse performance
])

Upload Model

The model will be uploaded with the framework set as Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION.

framework=Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION

model = wl.upload_model(model_name,
                        model_file_name,
                        framework=framework,
                        input_schema=input_schema,
                        output_schema=output_schema,
                        convert_wait=True)
model
Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime................................................successful

Ready

Name: hf-zero-shot-classification
Version: 6953b047-bfea-424e-9496-348b1f57039f
File Name: model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip
SHA: 3dcc14dd925489d4f0a3960e90a7ab5917ab685ce955beca8924aa7bb9a69398
Status: ready
Image Path: proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.4.0-main-4005
Architecture: None
Updated At: 2023-20-Oct 15:50:58
model.config().runtime()
'flight'

Deploy Pipeline

The model is uploaded and ready for use. We’ll add it as a step in our pipeline, then deploy the pipeline. For this example we’re allocating 0.25 CPU and 1 Gi RAM to the pipeline through the pipeline’s deployment configuration.

deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()
pipeline = get_pipeline(pipeline_name)
# clear the pipeline if used previously
pipeline.undeploy()
pipeline.clear()
pipeline.add_model_step(model)

pipeline.deploy(deployment_config=deployment_config)
pipeline.status()

Run Inference

A sample inference will be run. First the pandas DataFrame used for the inference is created, then the inference is run through the pipeline’s infer method.

input_data = {
        "inputs": ["this is a test", "this is another test"], # required
        "candidate_labels": [["english", "german"], ["english", "german"]], # optional: using the defaults, similar to not passing this parameter
        "hypothesis_template": ["This example is {}.", "This example is {}."], # optional: using the defaults, similar to not passing this parameter
        "multi_label": [False, False], # optional: using the defaults, similar to not passing this parameter
}
dataframe = pd.DataFrame(input_data)
dataframe
   inputs                candidate_labels   hypothesis_template  multi_label
0  this is a test        [english, german]  This example is {}.  False
1  this is another test  [english, german]  This example is {}.  False
%time
pipeline.infer(dataframe)
CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 5.48 µs
   time                     in.candidate_labels  in.hypothesis_template  in.inputs             in.multi_label  out.labels         out.scores                                 out.sequence          check_failures
0  2023-10-20 15:52:07.129  [english, german]    This example is {}.     this is a test        False           [english, german]  [0.504054605960846, 0.49594545364379883]   this is a test        0
1  2023-10-20 15:52:07.129  [english, german]    This example is {}.     this is another test  False           [english, german]  [0.5037839412689209, 0.4962160289287567]   this is another test  0

Undeploy Pipelines

With the tutorial complete, the pipeline is undeployed to return the resources back to the cluster.

pipeline.undeploy()
Waiting for undeployment - this will take up to 45s ........................................ ok
name: hf-zero-shot-classification
created: 2023-10-20 15:46:51.341680+00:00
last_updated: 2023-10-20 15:51:04.118540+00:00
deployed: False
tags:
versions: d63a6bd8-7bf0-4af1-a22a-ca19c72bde52, 66b81ce8-8ab5-4634-8e6b-5534f328ef34
steps: hf-zero-shot-classification
published: False